Global News Digest

arXiv

FVSpec: Real-World Property-Based Tests as Lean Challenges

Title: FVSpec: Transforming Real-World Property-Based Tests into Lean Challenges

Abstract

This paper introduces a benchmark for assessing how well AI models and agents handle genuine formal software verification tasks. The dataset was built by extracting 11,039 property-based tests (PBTs) from active Python repositories. Of these, 2,772 (25%) were automatically converted into 9,415 Lean 4 specifications, each containing sorry placeholders where the proofs are still to be supplied. This works out to roughly 3.4 formalizations per translated PBT (9,415 / 2,772): when multiple attempts were generated, we retained several versions rather than selecting a single winner, preserving diversity across the quality metrics.
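To make "Lean 4 specifications with sorry placeholders" concrete, here is a minimal sketch of what one such challenge could look like. It is a hypothetical illustration, not an item from the FVSpec dataset: the property (reversing a list twice returns the original list), the theorem name, and the Hypothesis-style Python test quoted in the comment are all assumptions.

```lean
-- Hypothetical example, not taken from the FVSpec dataset.
-- Python property-based test being modeled (Hypothesis-style, assumed):
--   @given(st.lists(st.integers()))
--   def test_reverse_involution(xs):
--       assert list(reversed(list(reversed(xs)))) == xs

-- The specification states the property for all inputs;
-- `sorry` marks the proof that is left as the challenge.
theorem reverse_involution (xs : List Int) : xs.reverse.reverse = xs := by
  sorry
```

A prover, human or automated, is then asked to replace sorry with a proof that Lean accepts. For this toy statement the core library lemma List.reverse_reverse is enough; real specifications derived from repository code would generally be much harder.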

Translating PBTs into Lean specifications presents significant hurdles. It requires modeling Python semantics accurately in Lean, inferring the logical properties implicit in imperative test code, and working in a dependently typed language that is not widely adopted. To address these challenges, we describe a three-agent LLM pipeline that transpiles PBTs into Lean. We also present evaluations of coverage and quality, along with baseline proof-generation results for both automated and model-based techniques. The complete codebase, including the scraper and agent implementations, and the full dataset of PBTs and Lean specifications are released as open source. Ultimately, this benchmark aims to advance research in AI-assisted formal verification for real-world software, an area of growing importance as artificial intelligence contributes more and more to codebases worldwide.
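The point about logical properties being hidden inside imperative test code can be illustrated with a small sketch. Everything here is hypothetical and not drawn from the FVSpec dataset: the function under test, its name, and the test itself are invented for illustration, and the random inputs are generated with the standard library rather than a PBT framework such as Hypothesis so that the example stays self-contained.

```python
import random


def dedupe_preserving_order(items):
    """Hypothetical function under test: drop repeats, keep first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def check_dedupe_properties(trials=200, seed=0):
    """Hand-rolled property-based test: random inputs, imperative assertions."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Generate a random list of small integers (length 0 to 20).
        xs = [rng.randint(-5, 5) for _ in range(rng.randint(0, 20))]
        out = dedupe_preserving_order(xs)
        # Implicit property 1: the output contains no duplicates.
        assert len(out) == len(set(out))
        # Implicit property 2: input and output have the same set of elements.
        assert set(out) == set(xs)
        # Implicit property 3: applying the function again changes nothing.
        assert dedupe_preserving_order(out) == out
    return trials
```

Each assert corresponds to a universally quantified statement ("for every list xs, ..."), but the test only expresses it as a loop over sampled inputs. A translation into Lean would have to recover those statements and also model the Python list and set operations they rely on.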


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
