arXiv

Generalizable Multi-Task Learning for Wireless Networks Using Prompt Decision Transformers

June 4, 2026 · Fatih Temiz, Shavbo Salehi, Melike Erol-Kantarci

Title: Enabling Generalizable Multi-Task Learning in Wireless Networks via Prompt Decision Transformers

Abstract:

Future wireless networks must adapt quickly to dynamic task setups and highly heterogeneous environments. This necessity drives a transition from traditional, rule-based, and optimization-centric Radio Resource Management (RRM) systems to those powered by Artificial Intelligence (AI). AI-driven RRM solutions can capture complex nonlinear interactions, generalize across varied network states, and facilitate real-time, autonomous, and scalable decision-making.

Within RRM strategies, Coordinated Multipoint (CoMP) transmission plays a critical role in reducing inter-cell interference and boosting performance at cell edges, which ultimately enhances the Quality of Experience (QoE) in densely populated deployments. Nevertheless, determining the optimal multi-cell selection remains a formidable combinatorial problem. It involves jointly optimizing numerous potential serving-cell combinations amidst fluctuating channel conditions and traffic loads.
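To give a sense of why the selection problem is combinatorial, the following minimal sketch counts candidate serving-cell assignments under a simplified model that is an assumption made here, not the paper's formulation: each UE is served by any non-empty set of at most `max_set_size` cells, and UEs choose independently. The function names and the example sizes are hypothetical.

```python
from math import comb


def cooperating_sets_per_ue(num_cells: int, max_set_size: int) -> int:
    """Number of non-empty cell subsets of size <= max_set_size (assumed model)."""
    return sum(comb(num_cells, k) for k in range(1, max_set_size + 1))


def joint_selection_space(num_cells: int, num_ues: int, max_set_size: int) -> int:
    """Joint assignments when every UE picks its cooperating set independently."""
    return cooperating_sets_per_ue(num_cells, max_set_size) ** num_ues


if __name__ == "__main__":
    # Illustrative sizes only: 7 cells, 10 UEs, at most 3 cooperating cells per UE.
    per_ue = cooperating_sets_per_ue(7, 3)
    total = joint_selection_space(7, 10, 3)
    print(f"{per_ue} options per UE, {total:.3e} joint assignments")
```

Even at these small illustrative sizes the joint space is far too large to enumerate at every scheduling interval, which is the motivation for learned policies.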

While conventional Deep Reinforcement Learning (DRL) methods, such as Proximal Policy Optimization (PPO), have shown success, they are hindered by low sample efficiency, restricted generalization capabilities, and expensive retraining requirements when state and action spaces shift. To overcome these limitations, this study introduces a multi-task learning framework centered on the Prompt Decision Transformer (PromptDT). This approach transforms the multi-cell selection process into a sequence modeling task and facilitates learning across a wide array of network configurations.
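The sequence-modeling view can be illustrated with the usual Decision Transformer input layout, in which each time step contributes a return-to-go, a state, and an action token. The sketch below is a hedged illustration of that general layout only; the state features, action encoding, and reward values are hypothetical placeholders and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, object]


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Suffix sums: rtg[t] = rewards[t] + rewards[t + 1] + ... + rewards[-1]."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_token_sequence(
    states: Sequence[object],
    actions: Sequence[object],
    rewards: Sequence[float],
) -> List[Token]:
    """Interleave (return-to-go, state, action) tokens, one triple per time step."""
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions and rewards must have equal length")
    tokens: List[Token] = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", rtg), ("state", state), ("action", action)])
    return tokens


if __name__ == "__main__":
    # Hypothetical trajectory: states are per-UE feature tuples, actions are
    # indices into a table of candidate cooperating cell sets, rewards are QoE.
    states = [(0.2, 0.7), (0.3, 0.6), (0.5, 0.4)]
    actions = [4, 1, 1]
    rewards = [1.0, 2.0, 3.0]
    for token in to_token_sequence(states, actions, rewards):
        print(token)
```

A transformer trained on such sequences predicts the next action token from the preceding tokens, so choosing serving cells becomes next-token prediction conditioned on a target return.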

By using offline trajectories alongside task-specific prompts, PromptDT supports scalable learning across diverse setups, accommodating variations in the number of base stations and user equipment, as well as different scheduler policies. Experimental results indicate that, in multi-task scenarios, PromptDT improves QoE by up to 49% relative to baseline methods, and that the gains grow with model capacity. Furthermore, PromptDT generalizes well to previously unseen tasks, allowing robust few-shot adaptation to new network configurations without retraining or fine-tuning.
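Prompt conditioning can be sketched as follows, assuming (as in the usual prompt-based Decision Transformer setup) that a prompt is a short trajectory segment recorded in the target task and prepended to the most recent interaction history. The helper name `build_prompted_input`, the tuple layout, and the example values are hypothetical; the paper's actual prompt format may differ.

```python
from typing import List, Sequence, Tuple

# One time step: (return-to-go, state features, action index). Assumed layout.
Step = Tuple[float, tuple, int]


def build_prompted_input(
    prompt: Sequence[Step],
    recent: Sequence[Step],
    context_len: int,
) -> List[Step]:
    """Prepend a task prompt to the last `context_len` steps of recent history.

    The prompt comes from a few demonstration steps in the target task, so the
    same frozen model can be steered to a new configuration without weight updates.
    """
    if context_len < 1:
        raise ValueError("context_len must be at least 1")
    window = list(recent)[-context_len:]
    return list(prompt) + window


if __name__ == "__main__":
    # Hypothetical few-shot prompt from a new network configuration.
    prompt = [(5.0, (0.1, 0.9), 2), (3.0, (0.2, 0.8), 0)]
    # Hypothetical recent history collected while acting in that configuration.
    recent = [(4.0, (0.3, 0.3), 1), (3.5, (0.4, 0.2), 1), (2.0, (0.5, 0.1), 3)]
    model_input = build_prompted_input(prompt, recent, context_len=2)
    print(model_input)
```

Because adaptation happens only through the input sequence, switching to an unseen configuration in this sketch means swapping the prompt rather than updating model parameters.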
