Self-Reflective APIs: Structure Beats Verbosity for AI Agent Recovery
Title: Self-Reflective APIs: Structural Clarity Outperforms Verbose Explanations for AI Agent Restoration
Abstract:
When an AI agent encounters a validation error during an API call, it requires actionable guidance rather than merely an explanation of the failure. This study introduces the concept of a self-reflective API, which provides a machine-readable recovery_feedback.suggestions[] payload upon validation failure. This structured output enables agents to autonomously repair requests and retry without relying on external reasoning processes.
In a pilot study that underwent rigorous leak auditing—featuring 30 participants per cell, three large language models (LLMs), and 10 adversarial tasks—structured suggestions significantly improved task completion rates. Specifically, Anthropic models demonstrated an increase of 36.7 to 40.0 percentage points (pp) compared to plain-English diagnoses, a result that was statistically significant (Fisher's exact $p \le 0.0022$). Furthermore, these structured responses yielded 1.8 to 2.2 times greater per-success token efficiency. However, this performance boost was not statistically significant for the gpt-4o-mini model ($p=0.435$). The observed pattern was further validated through a replication study in a different domain using a billing API.
Crucially, these comparative findings are only valid after auditing two previously undocumented classes of answer leakage within LLM benchmarks. To support transparency and reproducibility, the authors provide shipaudit_prompt_leakage.py as reusable continuous integration (CI) infrastructure.
Code and Data: https://github.com/arquicanedo/self-reflective-apis
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC






