Codifying the Judge: Scalable Evaluation via Program Distillation ($\textsc{Pajama}$). Shengqi Qiu*, Tzu-Heng Huang*, Frederic Sala. In submission. paper | code