RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases
Title: RelGT-AC: Leveraging Relational Graph Transformers for Autocomplete in Relational Databases
Abstract:
Relational databases form the backbone of contemporary enterprise, scientific, and healthcare infrastructures, yet applying predictive machine learning to them is difficult because the data is inherently multi-table, heterogeneous, and temporal. Relational Deep Learning (RDL) offers a solution by modeling databases as heterogeneous graphs and deploying graph neural networks (GNNs) directly on these structures. RelBench v2 introduces a new category of practically relevant tasks, known as autocomplete. These tasks mimic intelligent form-filling assistants: the goal is to predict the value of an existing column from its relational context.
To address these challenges, we introduce RelGT-AC (Relational Graph Transformer for Autocomplete). This model extends the existing RelGT architecture through three key innovations: first, a column masking technique that inhibits trivial solutions by hiding the target column during subgraph encoding; second, a versatile task head capable of handling binary classification, multiclass classification, and regression autocomplete tasks within a unified framework; and third, a TF-IDF text encoder designed to automatically identify and encode free-text columns, thereby capturing strong lexical signals that traditional categorical encoders often overlook.
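The column masking and text-encoding ideas above can be illustrated with a minimal sketch. It is not the RelGT-AC implementation: it works on flat rows rather than on sampled subgraphs, and the helper names (`mask_target`, `find_text_columns`, `tfidf`), the token-count heuristic for detecting free-text columns, the whitespace tokenizer, and the example columns (`criteria`, `phase`, `has_dmc`) are all assumptions made for illustration.

```python
import math
from collections import Counter


def mask_target(rows, target):
    """Hide the target column from the inputs and return it separately as labels."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    labels = [row[target] for row in rows]
    return features, labels


def find_text_columns(rows, min_avg_tokens=3.0):
    """Assumed heuristic: a column counts as free text when every value is a
    string and the values average at least `min_avg_tokens` whitespace tokens."""
    if not rows:
        return []
    text_cols = []
    for col in rows[0]:
        values = [row[col] for row in rows]
        if all(isinstance(v, str) for v in values):
            avg_tokens = sum(len(v.split()) for v in values) / len(values)
            if avg_tokens >= min_avg_tokens:
                text_cols.append(col)
    return text_cols


def tfidf(docs):
    """Plain TF-IDF: tf = count / doc length, idf = log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vocab = sorted(doc_freq)
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = len(toks) or 1  # guard against empty documents
        vectors.append(
            [counts[t] / length * math.log(n_docs / doc_freq[t]) for t in vocab]
        )
    return vocab, vectors


# Hypothetical rows loosely modeled on a clinical-trial table.
rows = [
    {"study_id": 1, "criteria": "adults aged 18 or older with diabetes", "phase": "II", "has_dmc": 1},
    {"study_id": 2, "criteria": "children under 12 with asthma", "phase": "III", "has_dmc": 0},
    {"study_id": 3, "criteria": "adults with diabetes and hypertension", "phase": "II", "has_dmc": 1},
]

features, labels = mask_target(rows, "has_dmc")      # target never reaches the encoder
text_cols = find_text_columns(features)              # only long string columns qualify
vocab, vectors = tfidf([row["criteria"] for row in features])
```

In this sketch, masking happens before any encoding, so the model input cannot contain the value it is asked to predict. Short string columns such as `phase` fall below the token threshold and would be left to a categorical encoder, while a term that occurs in every document (here, "with") receives zero weight under this unsmoothed IDF.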
In evaluations across seven tasks drawn from three RelBench v2 datasets (rel-trial, rel-f1, and rel-stack), RelGT-AC surpassed the GraphSAGE baseline on all three regression autocomplete tasks and improved performance by up to 10 AUROC points on text-heavy eligibility tasks, a gain attributed to the TF-IDF encoder.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



