The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives
Abstract:
Current legal, ethical, and regulatory frameworks for artificial intelligence operate on a foundational premise: whenever a significant outcome occurs, there exists at least one identifiable individual whose level of involvement and foresight was sufficient to assume meaningful responsibility. This study demonstrates that agentic AI systems fundamentally breach this premise, not due to technical constraints, but as a mathematical inevitability once their autonomy surpasses a specific, computable limit.
To address this, we formalize the concept of Human-Agent Collectives (joint systems comprising humans and AI) by modeling agents as state-policy tuples embedded within a shared structural causal model. We define autonomy via a four-dimensional information-theoretic profile encompassing epistemic, executive, evaluative, and social dimensions, and we characterize collective dynamics through interaction graphs and joint action spaces.
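As a rough illustration of this setup, the sketch below represents a collective as a directed interaction graph over agents, each tagged as human or AI and carrying a four-field autonomy profile. All names here (`AutonomyProfile`, `Collective`, `has_human_ai_loop`) and the use of plain floats for the profile are assumptions made for illustration; the abstract does not specify the paper's data structures or how the profile is measured. The helper checks for a directed cycle that passes through at least one human and one AI agent.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class AutonomyProfile:
    # Hypothetical encoding: one number per dimension named in the abstract.
    epistemic: float
    executive: float
    evaluative: float
    social: float


@dataclass
class Collective:
    kinds: Dict[str, str]                  # agent name -> "human" or "ai"
    profiles: Dict[str, AutonomyProfile]   # agent name -> autonomy profile
    edges: Set[Tuple[str, str]]            # directed influence: (source, target)

    def reachable(self, start: str) -> Set[str]:
        """Return every agent reachable from `start` along directed edges."""
        seen: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for src, dst in self.edges:
                if src == node and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    def has_human_ai_loop(self) -> bool:
        """True if some human and some AI agent can each reach the other."""
        humans = [n for n, k in self.kinds.items() if k == "human"]
        ais = [n for n, k in self.kinds.items() if k == "ai"]
        return any(
            a in self.reachable(h) and h in self.reachable(a)
            for h in humans
            for a in ais
        )


# Example: a human instructs a planner agent, whose output feeds back to the human.
profile = AutonomyProfile(epistemic=0.5, executive=0.5, evaluative=0.5, social=0.5)
looped = Collective(
    kinds={"alice": "human", "planner": "ai"},
    profiles={"alice": profile, "planner": profile},
    edges={("alice", "planner"), ("planner", "alice")},
)
one_way = Collective(
    kinds={"alice": "human", "planner": "ai"},
    profiles={"alice": profile, "planner": profile},
    edges={("alice", "planner")},
)
```

In this toy example, `looped.has_human_ai_loop()` is true and `one_way.has_human_ai_loop()` is false, since the second graph has no path from the AI agent back to the human.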
Legitimate accountability is axiomatized through four minimal criteria: Attributability (requiring causal contribution for responsibility), the Foreseeability Bound (limiting responsibility to predictive capacity), Non-Vacuity (ensuring at least one agent holds non-trivial responsibility), and Completeness (mandating that all responsibility be fully allocated). Our primary finding, the Accountability Incompleteness Theorem, establishes that for any collective whose compound autonomy exceeds the Accountability Horizon and whose interaction graph includes a human-AI feedback loop, it is impossible to satisfy all four properties simultaneously.
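To make the four criteria concrete, here is a minimal sketch under a deliberately simplified reading that is an assumption of this example, not the paper's formalization: each agent has a causal contribution, a foreseeability score, and an assigned responsibility share, all in [0, 1], and the shares should sum to 1. The names `AgentRecord` and `check_axioms` and the threshold `eps` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AgentRecord:
    causal_contribution: float  # assumed in [0, 1]
    foreseeability: float       # assumed in [0, 1]
    responsibility: float       # assigned share, assumed in [0, 1]


def check_axioms(
    agents: Dict[str, AgentRecord], eps: float = 0.05, tol: float = 1e-9
) -> Dict[str, bool]:
    """Check one simplified reading of the four criteria for an allocation."""
    records = list(agents.values())
    return {
        # Responsibility only where there is some causal contribution.
        "attributability": all(
            r.causal_contribution > 0 for r in records if r.responsibility > 0
        ),
        # Responsibility never exceeds what the agent could foresee.
        "foreseeability_bound": all(
            r.responsibility <= r.foreseeability + tol for r in records
        ),
        # At least one agent holds more than a trivial share.
        "non_vacuity": any(r.responsibility > eps for r in records),
        # All responsibility is handed out: shares sum to 1.
        "completeness": abs(sum(r.responsibility for r in records) - 1.0) <= tol,
    }


# Low-autonomy toy case: the human foresees most of the outcome.
low = check_axioms({
    "human": AgentRecord(causal_contribution=0.8, foreseeability=0.9, responsibility=0.8),
    "ai": AgentRecord(causal_contribution=0.2, foreseeability=0.5, responsibility=0.2),
})

# High-autonomy toy case: nobody foresees much, and shares are capped by foresight.
high = check_axioms({
    "human": AgentRecord(causal_contribution=0.2, foreseeability=0.1, responsibility=0.1),
    "ai_a": AgentRecord(causal_contribution=0.4, foreseeability=0.2, responsibility=0.2),
    "ai_b": AgentRecord(causal_contribution=0.4, foreseeability=0.3, responsibility=0.3),
})
```

In the second case the foreseeability scores sum to less than 1, so under this simplified reading any allocation that respects the Foreseeability Bound must leave some responsibility unassigned, and Completeness fails. This toy tension only illustrates how the criteria can conflict; it is not the theorem's argument.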
This impossibility is structural; measures such as transparency, audits, and oversight cannot resolve the issue without curtailing autonomy. Below the defined threshold, however, legitimate frameworks do exist, indicating a sharp phase transition. Experiments conducted on 3,000 synthetic collectives validated all predictions with zero observed violations. As the first impossibility result in the field of AI governance, this work delineates a formal boundary: below it, existing paradigms remain valid, while above it, distributed accountability mechanisms become essential.
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC




