Geometry-Aware Tabular Diffusion
Abstract
Tabular synthesis is essential for privacy-preserving data sharing and dataset augmentation, yet existing diffusion models typically rely on implicit mechanisms to model relationships between columns. To address this, we present Geometry-Aware Tabular Diffusion (GATD), which augments tabular diffusion denoisers with pairwise angles and lengths derived from differences in column values. These geometric features serve two purposes: they are fed to the model as inputs and are also used as auxiliary targets.
Our Multi-Layer Perceptron (MLP) implementation sets a new state of the art on benchmarks while using 3.5x fewer parameters on average (and up to 25x fewer for classification tasks). Across ten datasets, GATD ranks first on 8 of 10 for Shape, 7 of 10 for Trend, and 9 of 10 for downstream utility (measured by F1 and RMSE). It also lowers Shape and Trend errors by 27% and 20%, respectively.
The default loss weights transfer beyond MLPs: applied to Graph Neural Network (GNN) and Transformer denoisers, they improve Shape in 27 of 30 architecture-dataset combinations and Trend in 25 of 30. A matched ablation confirms that the gains come from the supervision rather than from the additional input features or increased model capacity. These findings indicate that explicit relational supervision is a highly portable inductive bias for tabular diffusion models.
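To make the idea of pairwise geometric features and auxiliary targets concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the features, so the geometry here is an assumption: each column pair (i, j) in a row is treated as a segment from (0, x_i) to (1, x_j), whose length and angle depend only on the difference x_j - x_i. The function names `pairwise_geometry` and `total_loss` and the weights `w_len` and `w_ang` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_geometry(row):
    """Return (lengths, angles) for every column pair i < j of one numeric row.

    ASSUMPTION: the pair (x_i, x_j) is read as a segment from (0, x_i) to
    (1, x_j). This is one plausible reading of "angles and lengths derived
    from differences in column values", not the paper's stated definition.
    """
    lengths, angles = [], []
    for x_i, x_j in combinations(row, 2):
        diff = x_j - x_i
        lengths.append(math.hypot(1.0, diff))  # segment length
        angles.append(math.atan2(diff, 1.0))   # segment angle in radians
    return lengths, angles


def mean_squared_error(a, b):
    """Plain mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def total_loss(pred_row, true_row, denoise_loss, w_len=1.0, w_ang=1.0):
    """Add weighted auxiliary geometry losses to a given denoising loss.

    ASSUMPTION: the auxiliary targets are the geometry of the clean row,
    and the weights w_len / w_ang are illustrative placeholders.
    """
    len_pred, ang_pred = pairwise_geometry(pred_row)
    len_true, ang_true = pairwise_geometry(true_row)
    aux_len = mean_squared_error(len_pred, len_true)
    aux_ang = mean_squared_error(ang_pred, ang_true)
    return denoise_loss + w_len * aux_len + w_ang * aux_ang
```

In this sketch a row with d numeric columns yields d(d-1)/2 lengths and as many angles; when the predicted row matches the clean row, both auxiliary terms vanish and only the denoising loss remains.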
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC



