arXiv

A Framework for Graph-Conditioned Hierarchical Shapley Attribution in Patent Valuation

June 2, 2026 · Joy Bose

Abstract

Determining the specific economic value of an individual patent within a product that incorporates tens of thousands of patents remains a persistent challenge in intellectual property economics. To address this, we introduce PatentXAI, a framework that reimagines patent valuation through the lens of explainable AI. By defining a characteristic function $v(S)$ that represents the revenue potential of a patent subset $S$, the framework calculates a patent’s Shapley value to determine its equitable share of product profits, ensuring adherence to the principles of efficiency, symmetry, dummy, and additivity.
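
For reference, the standard Shapley value of patent $i$ in a portfolio $N$ of $n$ patents is $\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,[v(S \cup \{i\}) - v(S)]$. The following minimal Python sketch evaluates this formula by brute force on a three-patent toy portfolio. The revenue function `toy_v` and the patent labels are hypothetical and chosen only to illustrate the efficiency, symmetry, and dummy properties; this is not the PatentXAI implementation.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values by enumerating all coalitions (cost grows as 2^n)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (v(s | {i}) - v(s))
        phi[i] = total
    return phi


def toy_v(S):
    """Hypothetical revenue: A and B earn 6 each, plus 8 when combined; C adds nothing."""
    revenue = 6.0 * len(S & {"A", "B"})
    if {"A", "B"} <= S:
        revenue += 8.0
    return revenue


patents = ["A", "B", "C"]
phi = shapley_values(patents, toy_v)
print(phi)
```

In this toy game the interchangeable patents A and B receive equal shares (symmetry), C receives nothing (dummy), and the shares sum to the full-portfolio revenue of 20 (efficiency). The exponential cost of the enumeration is what makes a restriction of the coalition space necessary at realistic portfolio sizes.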

To make the computation feasible, we restrict each patent’s coalitions to its Markov Blanket within a knowledge graph, relying on the C-SVE conditional independence theorem (Li et al., 2020). In scaling experiments from 12 to 100 patents on Pareto-distributed coverage graphs, the median Markov Blanket size at $n=100$ is 32.9% of $n$, and the 90th percentile is 55.2% of $n$. The computation requires only 10 milliseconds per patent. The deviation from the exact ground truth is 0.088 at $n=12$, and the difference from a high-sample Monte Carlo reference at $n=100$ is 0.062 ± 0.003.
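
The idea of restricting coalitions to a blanket can be sketched as follows. This abstract does not specify how the Markov Blanket is constructed or how the C-SVE estimator is applied, so the sketch makes two loud assumptions: the knowledge graph is undirected, and a patent's blanket is simply its set of direct neighbours. The graph `ADJ`, the revenue function `toy_v`, and the function names are hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical undirected patent graph (assumption: blanket = direct neighbours).
ADJ = {
    "P1": ["P2", "P3"],
    "P2": ["P1"],
    "P3": ["P1", "P4"],
    "P4": ["P3"],
    "P5": [],
}


def toy_v(S):
    """Hypothetical revenue: 1 per patent plus 2 per graph edge fully inside S."""
    edges_inside = sum(1 for a in S for b in ADJ[a] if b in S and a < b)
    return 1.0 * len(S) + 2.0 * edges_inside


def blanket_shapley(i, adjacency, v):
    """Shapley value of i in the game restricted to i and its blanket."""
    blanket = sorted(adjacency[i])
    m = len(blanket) + 1  # number of players in the restricted game
    total = 0.0
    for k in range(m):
        weight = factorial(k) * factorial(m - k - 1) / factorial(m)
        for coalition in combinations(blanket, k):
            s = frozenset(coalition)
            total += weight * (v(s | {i}) - v(s))
    return total


local_phi = {p: blanket_shapley(p, ADJ, toy_v) for p in ADJ}
print(local_phi)
```

Each patent is evaluated over at most $2^{|MB|}$ coalitions instead of $2^{n-1}$. In this toy game revenue interacts only along graph edges, so the restricted values coincide with the full Shapley values and still sum to the full-portfolio revenue; for a general $v(S)$ the restriction is an approximation, which is why the abstract reports deviations from the ground truth.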

In experiments involving dense components, where 80% of patents belong to a single cluster, the blanket appropriately expands to encompass this dense area. This results in a reduced difference of 0.039 against the reference, as the aggregated computation yields higher accuracy for homogeneous portfolios. The profit allocation process is hierarchical: exact Shapley values are first used to distribute total profit among macro-components, followed by a centrality-weighted Shapley distribution that allocates each component’s budget among the patents it covers.
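
The two-level allocation can be illustrated with a short Python sketch. Step 1 computes exact Shapley values over a handful of macro-components. For step 2, the abstract does not define the centrality-weighted Shapley distribution, so the sketch substitutes a plain split proportional to a centrality score; that substitution is an assumption, not the paper's method. All component names, values, and centrality scores are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values by full enumeration (fine for a few components)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (v(s | {i}) - v(s))
        phi[i] = total
    return phi


def split_by_centrality(budget, centrality):
    """Stand-in for step 2: share a budget in proportion to centrality scores."""
    total = sum(centrality.values())
    if total == 0:  # fall back to an equal split when no patent has any centrality
        return {p: budget / len(centrality) for p in centrality}
    return {p: budget * c / total for p, c in centrality.items()}


# Hypothetical component-level profit with one synergy term.
STANDALONE = {"modem": 60.0, "display": 30.0, "battery": 10.0}


def component_v(S):
    profit = sum(STANDALONE[c] for c in S)
    if {"modem", "display"} <= S:
        profit += 20.0
    return profit


# Hypothetical centrality scores of the patents covering each component.
CENTRALITY = {
    "modem": {"M1": 3.0, "M2": 1.0},
    "display": {"D1": 1.0, "D2": 1.0},
    "battery": {"B1": 2.0},
}

budgets = shapley_values(list(STANDALONE), component_v)       # step 1
patent_shares = {}
for component, budget in budgets.items():                     # step 2
    patent_shares.update(split_by_centrality(budget, CENTRALITY[component]))
print(budgets)
print(patent_shares)
```

Because step 1 is efficient and step 2 distributes each component's budget in full, the patent-level shares add up to the total profit of the product.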

Our primary contribution is computational; estimating the characteristic function $v(S)$ from real-world data remains the key open problem. We separate this estimation problem from our computational method and propose a concrete roadmap for empirical validation using public datasets from ETSI, USPTO, and Lens.org.
