Multiple Choice Learning of Low-Rank Adapters for Language Modeling
Abstract:
We introduce LoRA-MCL, a training framework that extends next-token prediction in language models so that they can generate diverse and plausible sentence continuations at inference time. Language modeling is an inherently ill-posed problem: a single context can lead to several equally valid continuations. Our method addresses this ambiguity efficiently by combining Multiple Choice Learning (MCL) with Low-Rank Adaptation (LoRA) and training with a winner-takes-all loss. We provide a theoretical analysis of the approach under the assumption that the data are drawn from a mixture of distributions, and validate it using mixtures of Markov chains. Experiments on tasks such as machine translation, audio captioning, and visual captioning show that the method produces outputs that are both highly diverse and relevant to the context. Code for applying LoRA-MCL to various language models is publicly available.
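To make the winner-takes-all idea concrete, here is a minimal sketch of how such a loss could select among several hypotheses for one training example. It is a generic illustration of winner-takes-all selection, not the authors' implementation: the function names (`sequence_nll`, `winner_takes_all`) and the toy log-probabilities are hypothetical, and each "hypothesis" is assumed to correspond to one low-rank adapter scoring the same target sequence.

```python
from typing import List, Tuple


def sequence_nll(token_logprobs: List[float]) -> float:
    """Negative log-likelihood of one target sequence under one hypothesis."""
    return -sum(token_logprobs)


def winner_takes_all(per_head_logprobs: List[List[float]]) -> Tuple[int, float]:
    """Return (index of the winning hypothesis, its NLL) for one example.

    per_head_logprobs[k][t] is the log-probability that hypothesis k
    (assumed here to be one low-rank adapter) assigns to target token t.
    """
    if not per_head_logprobs:
        raise ValueError("need at least one hypothesis")
    losses = [sequence_nll(lp) for lp in per_head_logprobs]
    # The winner is the hypothesis with the lowest loss on this example.
    winner = min(range(len(losses)), key=losses.__getitem__)
    return winner, losses[winner]


# Toy example (made-up numbers): two hypotheses score the same 3-token target.
head_logprobs = [
    [-2.0, -1.5, -3.0],    # hypothesis 0
    [-0.5, -0.25, -1.0],   # hypothesis 1
]
winner, loss = winner_takes_all(head_logprobs)
```

In a training loop built on this sketch, only the winning hypothesis would receive the gradient for that example, which is what lets different hypotheses specialize on different plausible continuations.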
Source: arXiv



