arXiv

GIFT: Games as Informal Training for Generalizable LLMs

June 4, 2026 · Nuoyan Lyu, Bingbing Xu, Xueyun Tian, Weihao Meng, Yige Yuan, Yang Zhang, Zhiyong Huang, Tat-Seng Chua, Huawei Shen

Title: GIFT: Leveraging Games as Informal Training for Generalizable LLMs

Abstract:

While Large Language Models (LLMs) have demonstrated exceptional proficiency in structured domains like code generation and mathematical reasoning, they continue to face challenges in broader competencies such as social intelligence, creativity, and planning. Drawing inspiration from human cognitive development, where both formal instruction and informal experiences contribute to intelligence, we propose integrating informal learning into LLM training. We use games as environments that provide feedback-driven training signals without manual annotation.

To encompass a wide spectrum of capabilities, including abstract reasoning, strategic planning, creative output, and social interaction, our approach merges traditional formal math tasks with three distinct game-based scenarios: Matrix Games, TicTacToe, and Who's the Spy. However, applying a unified Reinforcement Learning (RL) objective to this mixed dataset can obscure task-specific learning signals and lacks explicit mechanisms for coordinating gradients across different tasks.

To address these limitations, we introduce Coordinated Subtask Training (CST). This method replaces the single, mixed update step with sequential, subtask-specific updates. This strategy isolates heterogeneous RL signals while implicitly fostering coordination among the subtasks. Our experiments on ability-oriented benchmarks show that informal learning through games improves generalization beyond what formal training achieves in isolation. Furthermore, CST significantly enhances multi-task RL by maintaining strong in-domain performance on subtasks while also raising broader general abilities. The associated code and data have been made publicly available.
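
The contrast between a single mixed update and sequential, subtask-specific updates can be illustrated with a toy sketch. This is not the authors' implementation. The quadratic losses, the plain gradient-descent step, the learning rate, and every name below (`make_quadratic_grad`, `mixed_update`, `sequential_update`) are assumptions made for illustration, with simple quadratic objectives standing in for the per-subtask RL signals.

```python
# Toy sketch only: quadratic losses stand in for per-subtask RL objectives.
# All names here are hypothetical and not taken from the paper's code.
from typing import Callable, Dict, List

Params = List[float]
GradFn = Callable[[Params], Params]


def make_quadratic_grad(target: Params) -> GradFn:
    """Gradient of 0.5 * ||params - target||^2 (a stand-in subtask signal)."""
    def grad(params: Params) -> Params:
        return [p - t for p, t in zip(params, target)]
    return grad


def sgd_step(params: Params, grad: Params, lr: float) -> Params:
    """Plain gradient-descent step."""
    return [p - lr * g for p, g in zip(params, grad)]


def mixed_update(params: Params, subtasks: Dict[str, GradFn], lr: float) -> Params:
    """One step on the average of all subtask gradients (single mixed update)."""
    grads = [grad_fn(params) for grad_fn in subtasks.values()]
    avg = [sum(column) / len(grads) for column in zip(*grads)]
    return sgd_step(params, avg, lr)


def sequential_update(params: Params, subtasks: Dict[str, GradFn], lr: float) -> Params:
    """One step per subtask, in order. Each gradient is evaluated at the
    parameters left by the previous subtask's step."""
    for grad_fn in subtasks.values():
        params = sgd_step(params, grad_fn(params), lr)
    return params


# Four stand-in subtasks named after the task families in the abstract.
subtasks = {
    "math": make_quadratic_grad([1.0, 0.0]),
    "matrix_games": make_quadratic_grad([0.0, 1.0]),
    "tictactoe": make_quadratic_grad([-1.0, 0.0]),
    "whos_the_spy": make_quadratic_grad([0.0, -1.0]),
}

start = [2.0, 2.0]
mixed = mixed_update(start, subtasks, lr=0.1)
sequential = sequential_update(start, subtasks, lr=0.1)
print("mixed:", mixed)
print("sequential:", sequential)
```

In this sketch the sequential variant takes four steps per round while the mixed variant takes one, so the two results are not directly comparable at the same learning rate. The point is only structural: in the sequential loop each subtask's gradient is computed separately and sees the effect of the updates before it, whereas the mixed step blends all signals into one gradient.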
