arXiv

RIDE: An Open Dataset and Benchmark for Train Delay Prediction

Title: RIDE: A Comprehensive Open Dataset and Benchmarking Framework for Predicting Train Delays

Abstract:

Accurately forecasting train delays is critical for both commuters and railway management. However, the field has struggled to measure progress because it lacks unified datasets, consistent prediction objectives, and standardized evaluation methods. To close this gap, we present RIDE, a large-scale, open-access dataset and benchmarking suite developed for the Belgian national railway network. Spanning 2023 through 2025, RIDE aggregates 94.5 million train events, 3.6 million journeys, and 35.7 million weather data points.

The resource is structured as a layered data pipeline that transforms raw inputs from railway and meteorological sources into two distinct public releases: a versatile intermediate relational dataset and specific benchmark datasets optimized for model training. This benchmarking suite standardizes the prediction task, defines the training and testing splits, and establishes a unified evaluation protocol to facilitate direct model comparisons.

Using this framework, we conduct the first exhaustive comparative analysis of non-learning, statistical learning, and deep learning approaches. Our results indicate that learning-based methods significantly outperform non-learning models. Among the learning-based methods, graph neural networks deliver the highest average performance, though the top-performing models differ only marginally from one another. In addition to standard aggregate metrics such as Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE), the framework offers granular breakdowns by prediction horizon and delay magnitude, allowing for a deeper examination of model behavior across various forecasting scenarios.
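The aggregate metrics and the per-bucket breakdowns mentioned above can be illustrated with a minimal Python sketch. It is not the RIDE evaluation code: the function names, the bucket edges, the choice of minutes as the unit, and the toy values are all assumptions made for illustration.

```python
import math
from bisect import bisect_right
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |truth - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def bucket_label(value, edges):
    """Return the half-open interval [lo, hi) of sorted `edges` containing `value`."""
    i = bisect_right(edges, value)
    lo = "-inf" if i == 0 else str(edges[i - 1])
    hi = "inf" if i == len(edges) else str(edges[i])
    return f"[{lo}, {hi})"


def metrics_by_bucket(y_true, y_pred, keys, edges):
    """Compute MAE and RMSE separately for each bucket of `keys`."""
    groups = defaultdict(lambda: ([], []))
    for t, p, k in zip(y_true, y_pred, keys):
        truths, preds = groups[bucket_label(k, edges)]
        truths.append(t)
        preds.append(p)
    return {
        label: {"n": len(ts), "mae": mae(ts, ps), "rmse": rmse(ts, ps)}
        for label, (ts, ps) in groups.items()
    }


# Toy values, invented for illustration only (not taken from RIDE).
observed = [0.0, 2.0, 10.0, 20.0]   # observed delay, assumed to be in minutes
predicted = [1.0, 2.0, 7.0, 24.0]   # predicted delay, assumed to be in minutes
horizon = [5, 5, 30, 30]            # assumed minutes between prediction and event

overall = {"mae": mae(observed, predicted), "rmse": rmse(observed, predicted)}
by_horizon = metrics_by_bucket(observed, predicted, horizon, edges=[15])
by_magnitude = metrics_by_bucket(observed, predicted, observed, edges=[5])
```

Bucketing by `horizon` separates short-range from long-range forecasts, while bucketing by `observed` separates small delays from large ones. Because RMSE weights large errors more heavily than MAE, comparing the two within each bucket hints at whether a model's error comes from a few large misses or from many small ones.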


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
