arXiv

Speedrunning Tabular Foundation Model Pretraining

June 3, 2026 · Salih Bora Ozturk, Alexander Pfefferle, Frank Hutter

Title: Accelerating Tabular Foundation Model Pretraining Through Community Speedruns

The high computational cost of pretraining is a significant hurdle for progress on tabular foundation models, because it slows the testing of novel architectures, priors, and optimization strategies. The research community also lacks a standardized way to compare efficiency gains and build on them. To address this, we have launched a community-driven speedrun initiative for nanoTabPFN. Participants optimize a single-file training script and compete to reach a fixed downstream ROC AUC target on a subsampled TabArena dataset as quickly as possible, using a single NVIDIA L40S GPU.
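The time-to-target measurement described above can be sketched as follows. This is a minimal illustration, not the project's actual harness: the names `time_to_target`, `train_step`, and `evaluate`, the evaluation interval, and the choice to include evaluation time in the clock are all assumptions made for the example.

```python
import time
from typing import Callable, Optional, Tuple


def time_to_target(
    train_step: Callable[[], None],   # hypothetical: performs one training step
    evaluate: Callable[[], float],    # hypothetical: returns downstream ROC AUC
    target: float,
    eval_every: int = 10,             # assumed evaluation interval
    max_steps: int = 10_000,
) -> Tuple[Optional[float], int]:
    """Train until evaluate() reaches target; return (elapsed minutes, steps).

    Returns (None, max_steps) if the target is never reached.
    Assumption: evaluation time counts toward the elapsed time.
    """
    start = time.perf_counter()
    for step in range(1, max_steps + 1):
        train_step()
        if step % eval_every == 0 and evaluate() >= target:
            return (time.perf_counter() - start) / 60.0, step
    return None, max_steps


# Toy usage: the "score" rises by 1/40 per step, so it reaches 0.75 at step 30.
state = {"n": 0}


def toy_step() -> None:
    state["n"] += 1


def toy_eval() -> float:
    return min(1.0, state["n"] / 40)


minutes, steps = time_to_target(toy_step, toy_eval, target=0.75)
print(steps)
```

In a real run, `train_step` and `evaluate` would wrap the training script's optimizer step and the benchmark evaluation; the toy functions here only exercise the control flow.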

The current record reaches the target metric in just 0.92 minutes. This is an 81-fold speedup over the baseline time of 74.32 minutes, and it also reduces the number of synthetic datasets required by a factor of 22. The speedrun framework establishes a clear protocol for researchers to contribute, validate, and stack pretraining improvements, and the leaderboard remains open for new submissions. All code and performance records are available at https://github.com/borawhocodess/modded-nanotabpfn.
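The reported speedup can be checked directly from the two times given above. The snippet below is only a quick arithmetic check; the variable names are illustrative.

```python
baseline_minutes = 74.32  # baseline time to target, from the text
record_minutes = 0.92     # current record time to target, from the text

speedup = baseline_minutes / record_minutes
record_seconds = record_minutes * 60

print(f"speedup: {speedup:.1f}x")           # roughly 80.8x, i.e. about 81-fold
print(f"record: {record_seconds:.1f} seconds")
```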
