From Features to Actions: Explainability in Traditional and Agentic AI Systems
Title: Moving Beyond Features: A Comparative Analysis of Explainability in Static and Agentic AI
Abstract:
For the past ten years, the field of Explainable AI (XAI) has predominantly concentrated on deciphering individual model predictions. This approach typically generates post-hoc explanations that map inputs to outputs within a static decision framework. However, the emergence of large language models (LLMs) has facilitated the development of agentic AI systems, where behavior evolves through multi-step trajectories. In these dynamic environments, outcomes are dictated by sequences of decisions rather than isolated outputs, raising questions about how traditional explanation methods, designed for static predictions, can be adapted to settings where behavior unfolds over time.
This study addresses this gap by evaluating attribution-based explanations against trace-based diagnostics in both static and agentic contexts. Our analysis reveals a distinct divergence in performance: while attribution techniques provide stable feature rankings in static scenarios (Spearman ρ = 0.86), they prove unreliable for diagnosing execution-level failures within agentic trajectories. Conversely, trace-grounded rubric evaluation effectively pinpoints behavioral breakdowns in agentic systems. Notably, this method highlights that state tracking inconsistencies occur 2.7 times more frequently in failed runs and are associated with a 49% reduction in success probability. These insights underscore the necessity of adopting trajectory-level explainability to properly evaluate and diagnose the autonomous behaviors of agentic AI.
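To make the ranking-stability figure concrete, the following is a minimal sketch of how a Spearman rank correlation between two feature-attribution vectors could be computed. The abstract does not say which rankings are compared, so the two vectors here (`attributions_a`, `attributions_b`) and their values are hypothetical, and the helper names are illustrative rather than taken from the authors' code.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks (ascending); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / sqrt(var_a * var_b)


# Hypothetical attribution scores for the same five features,
# e.g. from two attribution methods or two repeated runs (assumed data).
attributions_a = [0.42, 0.31, 0.15, 0.08, 0.04]
attributions_b = [0.38, 0.33, 0.05, 0.14, 0.10]

rho = spearman_rho(attributions_a, attributions_b)
print(round(rho, 2))  # 0.7
```

A value near 1 means the two vectors order the features almost identically, which is the sense in which a high ρ indicates stable rankings. In practice a library routine such as `scipy.stats.spearmanr` would typically replace the hand-written helpers.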
Code: https://github.com/VectorInstitute/unified-xai-evaluation-framework
Project page: https://vectorinstitute.github.io/unified-xai-evaluation-framework
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




