arXiv

What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA

June 4, 2026 · Jonas Petersen, Camilla Mazzoleni, Gian-Alessandro Lombardi, Federico Martelli, Riccardo Maggioni

Title: Identifying the Structural Inductive Bias That Enables Transformers to Reason on Knowledge Graphs: A Tabula RASA Study

Abstract:

Which structural inductive bias matters most for enabling transformers to reason over knowledge graphs? We run controlled ablation studies on a streamlined transformer architecture with four distinct, independently removable components (sparse adjacency masking, edge-type biases, query scaling, and value gating) to determine which structural signals are responsible for effective multi-hop reasoning. The results show a clear distinction: sparse adjacency masking is the primary driver of performance gains, accounting for the majority of the improvement over standard unmasked transformers, with gains of +72.5 percentage points (pp) on 3-hop MetaQA, +45.5 pp on WebQSP, and +53.9 pp on CWQ. In contrast, learned relation parameters provide only marginal refinements and can degrade performance when not supported by structural guidance. Zero-shot experiments further support this conclusion: attention mechanisms based on masking degrade 4.0 times less than relation-specific weights when edge types are excluded from the test set. Overall, our findings suggest that the essential inductive bias for multi-hop Knowledge Graph Question Answering (KGQA) is predominantly topological rather than relational.
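To make the central idea concrete, the following is a minimal sketch of sparse adjacency masking in attention, with an optional additive edge bias. It is not the authors' implementation: the abstract does not specify the architecture, so the sketch assumes the common approach of setting the logits of non-adjacent node pairs to negative infinity before the softmax. The function name `masked_attention` and its parameters are hypothetical, chosen only for illustration.

```python
import math

def masked_attention(scores, adjacency, edge_bias=None):
    """Row-wise softmax over attention logits, restricted to graph neighbors.

    scores:    n x n list of raw attention logits (query i, key j).
    adjacency: n x n list of bools; True means node i may attend to node j.
    edge_bias: optional n x n list of additive biases (a stand-in for a
               learned edge-type term; purely illustrative).
    """
    n = len(scores)
    weights = []
    for i in range(n):
        logits = []
        for j in range(n):
            if adjacency[i][j]:
                bias = edge_bias[i][j] if edge_bias is not None else 0.0
                logits.append(scores[i][j] + bias)
            else:
                # Masked pair: contributes exactly zero after the softmax.
                logits.append(-math.inf)
        row_max = max(logits)
        if row_max == -math.inf:
            # Node with no allowed neighbors: return an all-zero row.
            weights.append([0.0] * n)
            continue
        exps = [math.exp(x - row_max) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Path graph 0 - 1 - 2 with self-loops; all raw logits equal.
adjacency = [
    [True, True, False],
    [True, True, True],
    [False, True, True],
]
scores = [[0.0] * 3 for _ in range(3)]
weights = masked_attention(scores, adjacency)
```

In this toy example node 0 places no attention on node 2, because the two are not adjacent. Information from node 2 can therefore reach node 0 only through node 1 across successive layers, which is the topological constraint the abstract credits for multi-hop gains.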
