arXiv

Parthenon Law: A Self-Evolving Legal-Agent Framework

Abstract:

As legal-domain large language model (LLM) agents grow more capable, they could turn document-intensive legal work into manageable, reviewable outputs. Three challenges currently hinder their reliable deployment: (1) there is no large-scale empirical evidence on how today's most advanced model-and-harness combinations perform on complete legal matters; (2) no agent architecture is tailored to the legal sector, so existing solutions rely on general-purpose frameworks; and (3) systems cannot learn from their own outcomes in dynamic environments where facts, authorities, and deadlines evolve. This paper addresses each of these gaps.

First, we present a large-scale empirical study conducted on Harvey LAB, analyzing 12,510 agent trajectories. The findings reveal that even frontier-level agents struggle to resolve matters in a single pass. While per-criterion accuracy improves with more powerful models, the rate of strict matter completion remains stagnant.
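The gap between the two metrics can be illustrated with a minimal sketch. It assumes, hypothetically, that each matter is scored against a set of pass/fail criteria and that strict completion means every criterion passes; the abstract does not give these definitions, and the function names and toy data below are illustrative only, not the study's scoring code or results.

```python
from typing import Dict, List

Matter = Dict[str, bool]  # hypothetical: criterion name -> passed?


def per_criterion_accuracy(matters: List[Matter]) -> float:
    """Fraction of all individual criteria that passed, pooled over matters."""
    total = sum(len(m) for m in matters)
    passed = sum(sum(m.values()) for m in matters)
    return passed / total if total else 0.0


def strict_completion_rate(matters: List[Matter]) -> float:
    """Fraction of matters in which every criterion passed (assumed definition)."""
    if not matters:
        return 0.0
    return sum(all(m.values()) for m in matters) / len(matters)


# Toy data, invented for illustration: the "stronger" agent passes more
# criteria overall, yet still misses at least one criterion in each matter.
weaker = [
    {"c1": True, "c2": False, "c3": False, "c4": True},
    {"c1": False, "c2": True, "c3": False, "c4": True},
]
stronger = [
    {"c1": True, "c2": True, "c3": True, "c4": False},
    {"c1": True, "c2": True, "c3": False, "c4": True},
]

print(per_criterion_accuracy(weaker), strict_completion_rate(weaker))
print(per_criterion_accuracy(stronger), strict_completion_rate(stronger))
```

Under these assumptions, a single residual failure per matter is enough to keep strict completion flat even as per-criterion accuracy climbs.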

To overcome these limitations, we introduce Parthenon, a self-evolving legal-agent framework. This architecture decomposes the system into auditable components comprising the Model, Harness, Agent roles, legal Knowledge, deterministic Tools, and procedural Skills. These components are designed to ensure source traceability, accurate grounding of dates and numbers, compliance with deliverable standards, and effective issue closure.

Furthermore, Parthenon incorporates an anti-leakage learning loop. This mechanism transforms scored failures into task-agnostic adjustments to skills, tools, and knowledge bases. The system can thus improve through experience, much as a law firm refines its checklists and playbooks after each case, without requiring modifications to the underlying model weights. Our extensive empirical analysis demonstrates that Parthenon significantly boosts the performance of state-of-the-art models and harnesses on legal-matter tasks.
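One way to picture such a loop is the sketch below. It is not the paper's implementation: the `ScoredFailure` and `Playbook` types, the `GENERIC_RULES` table, and the substring-based leak check are all hypothetical stand-ins, chosen only to show scored failures being generalized into task-agnostic skill updates while task-specific text is kept out and no model weights are touched.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ScoredFailure:
    task_id: str   # identifier of the matter that failed (hypothetical)
    category: str  # coarse failure type, e.g. "date_grounding"
    detail: str    # task-specific text that must never reach the playbook


# Hypothetical mapping from failure category to a generic, reusable rule.
GENERIC_RULES: Dict[str, str] = {
    "date_grounding": "Recompute every deadline with a deterministic date tool.",
    "source_traceability": "Attach a source citation to every factual statement.",
}


@dataclass
class Playbook:
    skills: List[str] = field(default_factory=list)


def leaks(rule: str, failures: List[ScoredFailure]) -> bool:
    """Crude leak check: reject a rule that embeds a task id or task detail."""
    lowered = rule.lower()
    for failure in failures:
        for token in (failure.task_id, failure.detail):
            if token and token.lower() in lowered:
                return True
    return False


def evolve(playbook: Playbook, failures: List[ScoredFailure]) -> Playbook:
    """Return a new playbook extended with generic rules for observed failures."""
    updated = list(playbook.skills)
    for category in sorted({f.category for f in failures}):
        rule = GENERIC_RULES.get(category)
        if rule is None or rule in updated or leaks(rule, failures):
            continue  # unknown category, duplicate rule, or leaked specifics
        updated.append(rule)
    return Playbook(skills=updated)


failures = [
    ScoredFailure("matter-017", "date_grounding", "Filing deadline misstated"),
    ScoredFailure("matter-042", "source_traceability", "Clause 7.2 uncited"),
    ScoredFailure("matter-042", "unmapped_category", "Other issue"),
]
evolved = evolve(Playbook(), failures)
print(evolved.skills)
```

Only the playbook changes between rounds in this sketch; running `evolve` again on the same failures adds nothing new, which mirrors the idea of accumulating reusable guidance rather than memorizing individual tasks.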


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
