arXiv

Optimizing the Cost-Quality Tradeoff of Agentic Theorem Provers in Lean

Title: Balancing Expense and Efficacy in Agentic Theorem Proving within Lean

Abstract:

Large language models (LLMs) are increasingly used in pipelines that generate formal proofs in Lean. These pipelines typically decompose a hard problem into manageable lemmas, sample many proof attempts, and use compiler feedback to guide the search. Such methods can be prohibitively expensive, because much of the compute is spent on attempts that fail. To address this, we introduce an action routing agent with two components: a data plane and a control plane. The data plane produces natural-language lemma decompositions, translates them into Lean, and samples proof attempts for the resulting theorem and lemma targets. The control plane uses prior failed Lean attempts to estimate both the probability of success and the cost of further attempts on the current proof target, and on that basis decides whether to continue with that target or fall back to a fresh decomposition. On a subset of PutnamBench, the agent reduces cost by an average of 25.8% compared to a fixed-step baseline while maintaining equivalent performance. These results indicate that failed Lean trajectories carry useful signal for cost-aware resource allocation in agentic theorem proving.
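The abstract does not specify how the control plane forms its estimates, so the following is only a minimal illustrative sketch of a continue-or-redecompose rule, not the authors' method. It assumes a Beta-prior estimate of per-attempt success probability from the count of failed attempts, and a geometric model in which expected cost to solve is cost per attempt divided by success probability. All names (`TargetStats`, `estimate_success_prob`, `route`) and all parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TargetStats:
    """Hypothetical summary of the history for the current proof target."""
    failed_attempts: int     # number of failed Lean attempts so far
    cost_per_attempt: float  # assumed average cost of one further attempt


def estimate_success_prob(failed_attempts: int,
                          prior_successes: float = 1.0,
                          prior_failures: float = 1.0) -> float:
    """Posterior-mean success probability under an assumed Beta prior.

    Each observed failure lowers the estimate; with no observed
    successes the estimate is a / (a + b + failures).
    """
    return prior_successes / (prior_successes + prior_failures + failed_attempts)


def expected_cost_to_solve(success_prob: float, cost_per_attempt: float) -> float:
    """Expected cost if attempts are independent (geometric model, assumed)."""
    return cost_per_attempt / success_prob


def route(stats: TargetStats,
          redecompose_cost: float,
          fresh_success_prob: float) -> str:
    """Pick the action with the lower expected cost.

    `redecompose_cost` and `fresh_success_prob` are assumed inputs: the
    one-off cost of a new decomposition and the per-attempt success
    probability expected on a fresh target.
    """
    p_continue = estimate_success_prob(stats.failed_attempts)
    cost_continue = expected_cost_to_solve(p_continue, stats.cost_per_attempt)
    cost_fresh = redecompose_cost + expected_cost_to_solve(
        fresh_success_prob, stats.cost_per_attempt)
    return "continue" if cost_continue <= cost_fresh else "redecompose"


if __name__ == "__main__":
    early = TargetStats(failed_attempts=0, cost_per_attempt=1.0)
    late = TargetStats(failed_attempts=8, cost_per_attempt=1.0)
    print(route(early, redecompose_cost=3.0, fresh_success_prob=0.5))
    print(route(late, redecompose_cost=3.0, fresh_success_prob=0.5))
```

Under these assumptions, a target with few failures is kept because another attempt is still cheap in expectation, while a long run of failures pushes the estimated success probability low enough that paying for a fresh decomposition becomes the cheaper option.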


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
