arXiv

LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks

Title: LEAP: Enhancing LLMs in Formal Mathematics via Agentic Architectures

Abstract:

Although Large Language Models (LLMs) are proficient at informal mathematical reasoning, they often struggle to produce mechanically verifiable proofs in formal languages such as Lean. To address this limitation, we introduce LEAP, an agentic framework that equips general-purpose foundation models with state-of-the-art capabilities in automated formal theorem proving. LEAP builds on the models’ existing strengths: informal reasoning, instruction following, and iterative self-refinement. The system bridges informal blueprints and formal proof construction by decomposing intricate problems into manageable components and interacting continuously with the Lean compiler.
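
As a rough illustration of the decompose-and-refine idea described above, the following Python sketch proves a list of subgoals one at a time, feeding checker diagnostics back into each new attempt. It is not LEAP's implementation: `refine`, `prove_by_parts`, `CheckResult`, and the `propose` and `check` callables are hypothetical names introduced here, and `check` stands in for a call to the Lean compiler, which this sketch does not invoke.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CheckResult:
    ok: bool          # True when the checker accepts the candidate proof
    errors: str = ""  # diagnostics passed back to the next attempt


def refine(goal: str,
           propose: Callable[[str, str], str],
           check: Callable[[str], CheckResult],
           max_rounds: int = 4) -> Optional[str]:
    """Propose a proof, check it, and retry with the errors as feedback."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = propose(goal, feedback)  # hypothetical model call
        result = check(candidate)            # hypothetical compiler call
        if result.ok:
            return candidate
        feedback = result.errors
    return None  # round budget exhausted without an accepted proof


def prove_by_parts(subgoals: List[str],
                   propose: Callable[[str, str], str],
                   check: Callable[[str], CheckResult],
                   max_rounds: int = 4) -> Optional[Dict[str, str]]:
    """Prove each subgoal independently; give up if any subgoal stays open."""
    proofs: Dict[str, str] = {}
    for goal in subgoals:
        proof = refine(goal, propose, check, max_rounds)
        if proof is None:
            return None
        proofs[goal] = proof
    return proofs
```

In this sketch the only state carried between attempts is the latest error text; a real system would presumably also keep the informal blueprint and earlier attempts in context.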

To establish a rigorous evaluation standard beyond increasingly saturated benchmarks, we present Lean-IMO-Bench. The benchmark consists of IMO-style problems formalized in Lean, with concise statements that demand highly non-routine, multi-step proofs and span a broad range of difficulty.

Empirically, LEAP performs strongly. On the 2025 Putnam Competition, a prestigious annual mathematics contest for North American undergraduates, LEAP solved all 12 problems, matching recent results from leading formal mathematics models. On Lean-IMO-Bench, LEAP raises the one-shot formal solve rate of general-purpose LLMs from under 10% to 70%, well above the 48% achieved by a specialized, gold-medal-caliber IMO system. We also demonstrate LEAP’s capacity for research-level work by autonomously formalizing complex proofs for open combinatorial problems, including a verified proof of a critical subproblem in Knuth’s Hamiltonian decomposition of even-order Cayley graphs.


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
