arXiv

From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution

Abstract:

This technical report, currently in beta, asks how reusable experience should be represented so that it serves both as effective test-time control and as a foundation for iterative evolution. To answer this question, we conducted 4,590 controlled trials spanning 45 distinct scientific code-solving scenarios. Our analysis reveals that documentation-oriented Skill packages offer inconsistent control: their actionable signals are sparse, and inflating a concise experience object into a comprehensive documentation package typically provides no benefit and often lowers average performance.

Furthermore, we show that the form of the representation is a primary determinant of success. A compact "Gene" representation achieves the highest overall average, remains competitive even under significant structural perturbations, and outperforms Skill fragments given an equivalent budget. Conversely, adding documentation-oriented content to Genes generally diminishes rather than enhances their utility.

Beyond one-shot control, our findings indicate that Genes are better carriers for accumulating experience iteratively. Failure history integrated into Genes proves more effective than the same history attached to Skills or freeform text. The editable structure of the experience object is also critical: failure information yields the greatest benefit when distilled into concise warnings rather than naively appended. On the CritPt benchmark, gene-evolved systems improved over their paired base models, rising from 9.1% to 18.57% and from 17.7% to 27.14%. These outcomes suggest that the fundamental challenge in experience reuse lies not in supplying more data, but in encoding experience as a compact, control-oriented object ready for evolution.
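
To put the two reported CritPt improvements on a common footing, the short sketch below computes the absolute gain (in percentage points) and the relative gain for each base/evolved pair. Only the four percentages come from the abstract; the function name `gains` and the pair labels are illustrative assumptions, not terminology from the report.

```python
def gains(base_pct: float, evolved_pct: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = evolved_pct - base_pct
    relative = (evolved_pct / base_pct - 1.0) * 100.0
    return absolute, relative


# The two base -> gene-evolved pairs quoted in the abstract.
# The labels "pair A" / "pair B" are hypothetical; the abstract does not name the models.
pairs = {"pair A": (9.1, 18.57), "pair B": (17.7, 27.14)}

for label, (base, evolved) in pairs.items():
    absolute, relative = gains(base, evolved)
    print(f"{label}: +{absolute:.2f} points, +{relative:.1f}% relative")
```

Both pairs gain roughly the same number of percentage points, so the relative improvement is larger for the pair with the weaker base model.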


Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC
