Adalina: Adaptive Linear Approximation for the Shapley Value and Beyond
Title: Adalina: Adaptive Linear Approximation for the Shapley Value and Beyond
Abstract:
Semi-values, including the widely recognized Shapley value, have garnered significant interest across numerous attribution tasks. A persistent and fundamental obstacle in this domain is the efficient estimation of these values, as precise calculation typically demands an exponential number of utility queries relative to the number of players, $n$. To address the demands of large-scale scenarios, this study investigates the boundaries of efficiently approximating semi-values while adhering to a $\Theta(n)$ space limitation. Leveraging a vector concentration inequality, we construct a theoretical foundation that facilitates more refined query complexities for existing unbiased randomized methods.
Within this framework, we devise a linear-space algorithm that necessitates $O(\frac{n}{\epsilon^{2}}\log\frac{1}{\delta})$ utility queries to guarantee that $P(|\hat{\boldsymbol\phi}-\boldsymbol\phi|{2}\geq\epsilon)\leq \delta$ holds for a broad range of standard semi-values. Notably, our approach unifies OFA, unbiased kernelSHAP, SHAP-IQ, and regression-adjusted techniques, while clearly delineating the conditions under which paired sampling offers advantages. Furthermore, the proposed method enables the explicit reduction of mean squared error, $\mathbb{E}[|\hat{\boldsymbol\phi}-\boldsymbol\phi|{2}^{2}]$, tailored to specific utility functions. Consequently, we present Adalina, the inaugural adaptive, linear-time, and linear-space randomized algorithm that demonstrates theoretically superior mean squared error performance. We substantiate our theoretical results through experimental validation. The source code is accessible at https://github.com/watml/adalina.
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC



