Spatiotemporal Multi-Task Graph Transformer for Trip-Level Transit Prediction
Title: SMT-GraphFormer: A Spatiotemporal Multi-Task Graph Transformer for Trip-Level Transit Prediction
Abstract:
Analyzing passenger volume data from public transportation networks offers critical insights into urban mobility trends, serving as a foundational element for strategic planning, operational management, and system optimization. Nevertheless, accurately modeling and forecasting these patterns is difficult due to the complex, non-linear spatiotemporal relationships that exist between various stops and transit lines. Current methodologies frequently depend on static temporal, spatial, or stop-centric frameworks, which restricts their capacity to account for the evolution of trips and their broader network context.
To address these limitations, this research introduces SMT-GraphFormer, a novel spatiotemporal multi-task graph transformer that recasts trip-level transit prediction as a sequence-to-sequence problem. Given a line’s stop sequence and trip-level contextual data, the model forecasts subsequent passenger boarding and alighting counts. Delay and dwell time are additionally incorporated as surrogate tasks on the encoder side. The architecture has three core elements: graph embeddings that capture multi-relational similarities between stops, a context encoder that processes temporal and weather-related data, and a multi-gate mixture-of-experts module. This module produces task-specific decoder representations, enabling separate predictions for boarding and alighting volumes.
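To make the multi-gate mixture-of-experts idea concrete, the following is a minimal, untrained sketch in plain Python, not the authors' implementation. It assumes linear experts with a ReLU, one softmax gate per task, and random weights; the class name `MultiGateMoE`, the dimensions, and the input vector are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiGateMoE:
    """Shared experts with one softmax gate per task (illustrative only)."""

    def __init__(self, in_dim, hidden_dim, num_experts, tasks, seed=0):
        rng = random.Random(seed)
        # All tasks share the same pool of experts.
        self.experts = [random_matrix(hidden_dim, in_dim, rng)
                        for _ in range(num_experts)]
        # Each task owns a gate that scores the experts for a given input.
        self.gates = {t: random_matrix(num_experts, in_dim, rng) for t in tasks}

    def gate_weights(self, x, task):
        """Mixing weights over experts for one task; they sum to 1."""
        return softmax(linear(self.gates[task], x))

    def forward(self, x):
        """Return one mixed representation per task."""
        expert_out = [[max(0.0, h) for h in linear(w, x)] for w in self.experts]
        hidden_dim = len(expert_out[0])
        outputs = {}
        for task in self.gates:
            mix = self.gate_weights(x, task)
            outputs[task] = [
                sum(mix[e] * expert_out[e][j] for e in range(len(expert_out)))
                for j in range(hidden_dim)
            ]
        return outputs


# Hypothetical usage: a 4-dimensional decoder state for a single stop.
moe = MultiGateMoE(in_dim=4, hidden_dim=3, num_experts=2,
                   tasks=("boarding", "alighting"))
x = [0.2, -0.1, 0.5, 1.0]
reps = moe.forward(x)
```

Because each task has its own gate, the boarding and alighting heads can weight the shared experts differently for the same input, which is the property the abstract attributes to this module.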
Experiments on public bus transit datasets from Trondheim, Norway, indicate that SMT-GraphFormer outperforms stop-level tabular benchmarks. Ablation studies isolate and evaluate the contribution of each component. The sequential approach yields its largest gain on alighting prediction, improving the $R^2$ score by 0.24, and gives consistent improvements in forecasting boarding volumes, delays, and dwell times. These results highlight the effectiveness of incorporating explicit trip-level sequential biases and inter-target dependencies.
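For readers unfamiliar with the metric, $R^2$ is commonly computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that standard definition; the function name `r2_score` and the passenger counts are made up for illustration and are not taken from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical alighting counts at three stops (not data from the paper).
observed = [1.0, 2.0, 3.0]
perfect = r2_score(observed, [1.0, 2.0, 3.0])   # exact predictions
baseline = r2_score(observed, [2.0, 2.0, 2.0])  # always predict the mean
```

Under this definition, a model that always predicts the mean scores 0 and a perfect model scores 1, so the reported gain of 0.24 is an absolute difference on that scale.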
Overall, the study illustrates the capability of transformer-based sequence modeling to handle intricate spatiotemporal dynamics within public transit systems. It emphasizes the importance of utilizing architectures specifically designed for transit data, rather than relying on generic tabular models. Furthermore, the proposed framework establishes a flexible, horizon-agnostic foundation for scenario analysis within digital twin ecosystems, thereby aiding planners and transit operators in making well-informed decisions.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





