Cross-lingual Self-Consistency for Multilingual Reasoning with Language Models
Title: Enhancing Multilingual Reasoning in Language Models Through Cross-lingual Self-Consistency
Abstract:
Although large language models (LLMs) now cover a much wider range of languages, their strongest reasoning abilities remain largely confined to high-resource languages such as English. To bridge this gap, we introduce an unsupervised reinforcement learning (RL) framework that improves multilingual reasoning by enforcing cross-lingual self-consistency: a model should give the same final answer to equivalent problems posed in different languages. Existing techniques are constrained by scarce multilingual reasoning datasets and generalize poorly to unseen languages. In contrast, our method requires neither parallel data nor ground-truth answers. It improves average performance on the MGSM benchmark by up to 21.7% across ten languages. It also generalizes well, with a mean improvement of 18.2% on MGSM languages not included in the training set and gains of up to 6.2% on three out-of-distribution benchmarks. These findings highlight the effectiveness of consistency-driven strategies for advancing the multilingual reasoning of LLMs without supervised data.
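The cross-lingual self-consistency signal described in the abstract can be pictured with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the final answer has already been extracted from the model's response in each language, the most common answer across languages serves as a pseudo-label, and each response receives a 0/1 reward for agreeing with it. The abstract does not specify the reward actually used, and the name `consistency_rewards` is hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def consistency_rewards(answers: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Score each language's answer by agreement with the cross-lingual majority.

    `answers` maps a language code to the final answer extracted from the
    model's response in that language (None if nothing could be parsed).
    No ground-truth label is used; the majority answer is a pseudo-label.
    This is an illustrative assumption, not the paper's confirmed reward.
    """
    # Keep only answers that were successfully extracted.
    valid = [a.strip() for a in answers.values() if a is not None]
    if not valid:
        # Nothing to agree on, so no response is rewarded.
        return {lang: 0.0 for lang in answers}

    # The most frequent answer across languages acts as the pseudo-label.
    majority, _ = Counter(valid).most_common(1)[0]

    # Reward 1.0 for matching the majority, 0.0 otherwise (or if unparsed).
    return {
        lang: 1.0 if a is not None and a.strip() == majority else 0.0
        for lang, a in answers.items()
    }


if __name__ == "__main__":
    sample = {"en": "42", "de": "42", "sw": "41", "th": None}
    print(consistency_rewards(sample))
```

For the sample input, English and German agree on "42" and receive 1.0, while the Swahili answer and the unparsed Thai response receive 0.0. In an RL setup, per-response scores of this kind could serve as the reward for a policy update, which is one way a model might be trained without parallel data or ground-truth answers.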
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC