Paper Arxiv | GGLab

Our new paper is available on arXiv!

Our paper, "FormalRewardBench: A Benchmark for Formal Theorem Proving Reward Models," is now available on arXiv. It introduces the first benchmark for evaluating reward models in formal theorem proving with Lean 4. The benchmark consists of 250 preference pairs generated through five expert-curated error injection strategies.