Optimizing the Cost-Quality Tradeoff of Agentic Theorem Provers in Lean
Title: Balancing Expense and Efficacy in Agentic Theorem Proving within Lean
Abstract:
Large language models (LLMs) are increasingly used in pipelines that generate formal proofs in Lean. These pipelines typically decompose a problem into lemmas, sample many proof attempts, and use compiler feedback to guide the search. They can be prohibitively expensive, because much of the compute is spent on attempts that fail. To mitigate this, we introduce an action routing agent with two components: a data plane and a control plane. The data plane produces natural-language lemma decompositions, translates them into Lean, and samples proof attempts for the resulting theorem and lemma targets. The control plane observes prior failed Lean attempts to estimate both the probability of success and the cost of further attempts. Based on these estimates, it decides whether to continue with the current proof target or fall back to a fresh decomposition. On a subset of PutnamBench, the agent reduces cost by an average of 25.8% relative to a fixed-step baseline while maintaining equivalent performance. These results indicate that failed Lean trajectories carry useful signal for cost-aware resource allocation in agentic theorem proving.
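The abstract does not state the control plane's decision rule, so the following is only an illustrative sketch of how a continue-or-redecompose choice could be made from estimated success probability and cost. The names `Estimate`, `estimate_p_from_failures`, `expected_cost_per_success`, and `route` are hypothetical, the probability estimator is a simple rule-of-succession placeholder, and the expected-cost criterion assumes independent attempts; none of this is taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical control-plane estimate for one candidate action."""
    p_success: float  # estimated probability that the next attempt succeeds
    cost: float       # estimated cost of one attempt (e.g. tokens or dollars)


def estimate_p_from_failures(n_failed: int) -> float:
    """Placeholder estimator (assumption): rule of succession with zero
    successes, so the estimate shrinks as failed attempts accumulate."""
    return 1.0 / (n_failed + 2)


def expected_cost_per_success(est: Estimate) -> float:
    """Expected spend until the first success, assuming independent
    attempts with fixed success probability (geometric model)."""
    if est.p_success <= 0.0:
        return float("inf")
    return est.cost / est.p_success


def route(continue_est: Estimate, redecompose_est: Estimate) -> str:
    """Choose the action with the lower expected cost per success."""
    if expected_cost_per_success(continue_est) <= expected_cost_per_success(redecompose_est):
        return "continue"
    return "redecompose"


# Example: after 8 failed attempts on the current target, the estimated
# success probability is 0.1, so continuing costs 1.0 / 0.1 = 10 in
# expectation, versus 3.0 / 0.5 = 6 for a fresh decomposition.
current = Estimate(p_success=estimate_p_from_failures(8), cost=1.0)
fresh = Estimate(p_success=0.5, cost=3.0)
decision = route(current, fresh)
```

Under these assumed numbers the sketch routes to a fresh decomposition; with few failures so far, the same rule keeps working on the current target.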
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC




