arXiv

Self-Reflective APIs: Structure Beats Verbosity for AI Agent Recovery

Title: Self-Reflective APIs: Structural Clarity Outperforms Verbose Explanations for AI Agent Restoration

Abstract:

When an AI agent encounters a validation error during an API call, it requires actionable guidance rather than merely an explanation of the failure. This study introduces the concept of a self-reflective API, which provides a machine-readable recovery_feedback.suggestions[] payload upon validation failure. This structured output enables agents to autonomously repair requests and retry without relying on external reasoning processes.

In a pilot study that underwent rigorous leak auditing—featuring 30 participants per cell, three large language models (LLMs), and 10 adversarial tasks—structured suggestions significantly improved task completion rates. Specifically, Anthropic models demonstrated an increase of 36.7 to 40.0 percentage points (pp) compared to plain-English diagnoses, a result that was statistically significant (Fisher's exact $p \le 0.0022$). Furthermore, these structured responses yielded 1.8 to 2.2 times greater per-success token efficiency. However, this performance boost was not statistically significant for the gpt-4o-mini model ($p=0.435$). The observed pattern was further validated through a replication study in a different domain using a billing API.

Crucially, these comparative findings are only valid after auditing two previously undocumented classes of answer leakage within LLM benchmarks. To support transparency and reproducibility, the authors provide shipaudit_prompt_leakage.py as reusable continuous integration (CI) infrastructure.

Code and Data: https://github.com/arquicanedo/self-reflective-apis


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC

Related Articles

Revolut Co-Founder, CTO Vlad Yatsenko to Step Down From Role
Bloomberg

Revolut Co-Founder, CTO Vlad Yatsenko to Step Down From Role

Revolut co-founder and CTO Vlad Yatsenko is stepping down from his executive role. The resignation marks a significant l...

Microsoft’s AI Chief Says Anthropic Models Are Too Expensive
Bloomberg

Microsoft’s AI Chief Says Anthropic Models Are Too Expensive

Microsoft AI CEO Mustafa Suleyman criticized Anthropic’s models as too expensive. Meanwhile, Microsoft plans to allow us...

Ramp Notches $44 Billion Valuation in New Funding Round
Bloomberg

Ramp Notches $44 Billion Valuation in New Funding Round

RAMP secured a $44 billion valuation in its latest funding round. CEO Eric Glyman attended the 2026 Reagan National Econ...

China’s Robotaxi Dilemma Shows AI Policy Tension Between Growth and Jobs
Bloomberg

China’s Robotaxi Dilemma Shows AI Policy Tension Between Growth and Jobs

China’s robotaxi expansion highlights the policy tension between driving economic growth through AI and protecting emplo...

Exams watchdog warns of rise in high-tech cheating
BBC News

Exams watchdog warns of rise in high-tech cheating

Ofqual warns of rising high-tech cheating, with smart devices involved in 44% of misconduct cases. Invigilators are trai...

Thailand’s Richest Man Plans $4.3 Billion Expansion Amid AI Boom
Bloomberg

Thailand’s Richest Man Plans $4.3 Billion Expansion Amid AI Boom

Thailand’s wealthiest individual is investing $4.3 billion in expansion, capitalizing on the booming artificial intellig...