arXiv

RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models

June 2, 2026 · Yilang Zhang, Bingcong Li, Georgios B. Giannakis


Abstract:

Low-Rank Adaptation (LoRA) reduces the memory footprint and computational cost of fine-tuning large-scale models by updating only a low-dimensional subspace of the pre-trained weight matrices. Despite its efficiency, LoRA often converges slowly and suffers significant performance drops. These issues stem from inconsistent and unbalanced weight updates caused by the non-uniqueness of low-rank factorizations. To address them, this study identifies, at each step, the optimal low-rank factorization that minimizes an upper bound on the loss function. The resulting Refactored Low-Rank Adaptation (RefLoRA) method ensures consistent and balanced weight updates while promoting a flatter loss landscape, thereby accelerating stable convergence. Extensive experiments evaluate RefLoRA on natural language understanding and commonsense reasoning tasks using prominent large language models, including DeBERTaV3, LLaMA-7B, LLaMA2-7B, and LLaMA3-8B. The numerical results show that RefLoRA converges faster and performs better across various benchmarks, while incurring negligible additional computational overhead compared to existing state-of-the-art LoRA variants.
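
The non-uniqueness mentioned in the abstract can be illustrated with a small numerical sketch. This is not the RefLoRA algorithm itself, only a toy demonstration under stated assumptions: the low-rank update is written as `A @ B.T`, the matrix `G` is a random stand-in for the loss gradient with respect to the weights, and the names `step`, `P`, and `Q` are hypothetical. It shows that two factorizations of the same update can produce different weights after one plain gradient step on the factors, whereas an orthogonal change of basis leaves the step unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2  # toy sizes; r is the adapter rank

A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))
P = np.array([[3.0, 1.0], [0.0, 0.5]])  # any invertible r x r matrix

# Two factorizations of the same update: A B^T == (A P)(B P^{-T})^T
A2 = A @ P
B2 = B @ np.linalg.inv(P).T
delta_w = A @ B.T
assert np.allclose(delta_w, A2 @ B2.T)

G = rng.standard_normal((m, n))  # assumed stand-in for dL/dW
lr = 0.1

def step(A, B, G, lr):
    """One plain gradient step on the factors; returns the new product A B^T."""
    grad_A = G @ B    # chain rule: dL/dA = (dL/dW) B
    grad_B = G.T @ A  # chain rule: dL/dB = (dL/dW)^T A
    return (A - lr * grad_A) @ (B - lr * grad_B).T

w1 = step(A, B, G, lr)
w2 = step(A2, B2, G, lr)
# Same starting weights, different factorization, different result.
differs = not np.allclose(w1, w2)

# An orthogonal change of basis Q (where inv(Q).T == Q) does not alter the step.
Q, _ = np.linalg.qr(rng.standard_normal((r, r)))
w3 = step(A @ Q, B @ Q, G, lr)
```

In this sketch the first-order change in the weights depends on `B @ B.T` and `A @ A.T`, which are not preserved when the factors are rescaled by a non-orthogonal `P`. That dependence is one way to read the "inconsistent and unbalanced weight updates" the abstract attributes to LoRA.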
