Automated Conjecture Resolution with Formal Verification

Abstract:

The mathematical reasoning capabilities of large language models have advanced substantially, progressing from basic problem solving to complex, research-oriented challenges. Nevertheless, the inherent ambiguity of natural language makes it hard to resolve and verify such problems reliably. To address this, we introduce an automated framework that combines natural language reasoning with formal verification to solve research-level mathematical problems. The system comprises two agents: an informal reasoning agent named Rethlas and a formal verification agent called Archon.

Rethlas uses reasoning primitives together with our theorem search engine, Matlas, to explore potential solution strategies and generate candidate proofs. Archon, in turn, uses LeanSearch to convert informal arguments into fully formalized Lean 4 projects. Through task decomposition, iterative refinement, and automated proof synthesis, Archon ensures that the resulting proofs can be checked for correctness by machine.
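The propose, check, and refine pattern described above can be illustrated with a minimal Python sketch. It is not the Rethlas or Archon implementation: the names `refine_until_verified`, `toy_propose`, and `toy_check` are hypothetical, and the checker is a toy stand-in for compiling a Lean 4 project. The only Lean-specific fact assumed is that `sorry` marks an unfinished proof, so the toy checker rejects any candidate that contains it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One rejected candidate together with the checker's feedback."""
    candidate: str
    feedback: str


def refine_until_verified(
    propose: Callable[[List[Attempt]], str],
    check: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 5,
) -> Tuple[Optional[str], List[Attempt]]:
    """Ask `propose` for candidates until `check` accepts one.

    Returns the accepted candidate (or None if the round budget runs out)
    and the list of rejected attempts.
    """
    history: List[Attempt] = []
    for _ in range(max_rounds):
        candidate = propose(history)
        ok, feedback = check(candidate)
        if ok:
            return candidate, history
        # Keep the failure so the next proposal can react to it.
        history.append(Attempt(candidate, feedback))
    return None, history


# Hypothetical stand-ins, for illustration only.
def toy_propose(history: List[Attempt]) -> str:
    # First try leaves a gap; after any feedback, try a complete proof.
    return "by sorry" if not history else "by rfl"


def toy_check(candidate: str) -> Tuple[bool, str]:
    # Toy rule: a candidate with `sorry` is treated as unfinished.
    if "sorry" in candidate:
        return False, "proof contains an unfinished step"
    return True, "accepted"


proof, rejected = refine_until_verified(toy_propose, toy_check)
print(proof, len(rejected))
```

In a real system the checker would be the proof assistant itself, which is what makes the final result trustworthy regardless of how the candidates were produced.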

By employing this framework, we successfully resolved an open problem in commutative algebra and formally verified the proof in Lean 4 with minimal human intervention. Supplementary case studies highlight Rethlas’s proficiency in informal mathematical reasoning and discovery, as well as Archon’s capacity to formalize research-level proofs within Lean 4. Our experimental results indicate that robust theorem retrieval tools facilitate the discovery and application of mathematical techniques across different domains, while the formal agent can independently bridge nontrivial gaps in informal arguments. More broadly, this study presents a viable paradigm for mathematical research, demonstrating how informal and formal reasoning systems, augmented by theorem retrieval tools, can work together to generate verifiable outcomes, decrease human workload, and foster human-AI collaborative research.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
