FIRM: Federated In-client Regularized Multi-objective Alignment for Large Language Models
Abstract:
Aligning Large Language Models (LLMs) with human preferences typically requires navigating the tension between competing goals, such as ensuring helpfulness while maintaining harmlessness. This training process is not only computationally demanding but also raises substantial data privacy issues when centralized. While Federated Learning (FL) presents a viable solution, current Federated Multi-Objective Optimization (FMOO) techniques are hindered by significant communication bottlenecks; their dependence on sending multiple gradients to a central server does not scale effectively for large models.
To address these challenges, we present FIRM (Federated In-client Regularized Multi-objective alignment), a new algorithm designed to enhance communication efficiency while mitigating client disagreement drift. FIRM operates by having each client resolve a regularized multi-objective optimization problem locally. This approach removes the necessity for the multi-gradient transmissions characteristic of previous methods, as in-client regularization directly addresses client disagreement drift. As a result, clients are required to send only one set of adapted parameters, thereby preserving high communication efficiency.
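The communication pattern described above can be illustrated with a toy sketch. The abstract does not state the exact regularized problem, so everything concrete here is an assumption made for illustration: two objectives per client, toy quadratic losses standing in for reward objectives, a quadratic regularizer with hypothetical strength `beta` that pulls the objective weight toward a hypothetical preference `pref`, and plain parameter averaging on the server. The function names are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def regularized_weight(g1, g2, beta=0.1, pref=0.5):
    """Weight lam in [0, 1] on objective 1 that minimizes
    ||lam*g1 + (1-lam)*g2||^2 + beta*(lam - pref)^2.
    Assumed form of the in-client regularized problem (not from the paper);
    requires beta > 0 so the denominator is positive."""
    d = g1 - g2
    lam = (beta * pref - d @ g2) / (d @ d + beta)  # unconstrained minimizer
    return float(np.clip(lam, 0.0, 1.0))           # 1-D convex: clipping is exact

def client_update(x, a, b, steps=5, lr=0.1, beta=0.1, pref=0.5):
    """Local steps on toy objectives f1 = 0.5*||x-a||^2, f2 = 0.5*||x-b||^2."""
    x = x.copy()
    for _ in range(steps):
        g1, g2 = x - a, x - b                      # gradients of the toy objectives
        lam = regularized_weight(g1, g2, beta, pref)
        x -= lr * (lam * g1 + (1.0 - lam) * g2)    # step along the combined direction
    return x                                       # one parameter vector is uploaded

def federated_round(x, clients, **kw):
    """Server averages the single parameter vector returned by each client."""
    return np.mean([client_update(x, a, b, **kw) for a, b in clients], axis=0)
```

In this sketch each client uploads one vector per round regardless of the number of objectives, whereas a scheme that resolves the objective weights on the server would need one gradient per objective from every client. With a larger `beta`, the weights chosen by different clients stay closer to the shared `pref`, which is one plausible reading of how in-client regularization could limit disagreement between clients.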
We demonstrate that our algorithm converges to Pareto-stationary points and, to the best of our knowledge, offer the first finite-time convergence guarantees within this specific federated multi-objective alignment context. Our empirical results indicate that FIRM yields smoother training dynamics, less client disagreement drift, and better reward trade-offs relative to baseline methods. Additionally, we introduce a technique to embed preferences among objectives, supported by empirical Pareto plots that illustrate FIRM’s ability to smoothly adjust objective trade-offs in accordance with specified preferences.
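For reference, the standard definition of Pareto stationarity for M objectives is given below. The notation (objectives f_i, weights \lambda, simplex \Delta_M) is assumed here and is not taken from the abstract, which also does not say which stationarity measure its finite-time bound uses.

```latex
% x^* is Pareto stationary if some convex combination of the
% objective gradients vanishes at x^*:
\exists\, \lambda \in \Delta_M :\quad \sum_{i=1}^{M} \lambda_i \nabla f_i(x^*) = 0,
\qquad
\Delta_M = \Bigl\{ \lambda \in \mathbb{R}^M : \lambda_i \ge 0,\ \sum_{i=1}^{M} \lambda_i = 1 \Bigr\}.

% Equivalently, the minimum-norm convex combination of gradients is zero:
\min_{\lambda \in \Delta_M} \Bigl\| \sum_{i=1}^{M} \lambda_i \nabla f_i(x^*) \Bigr\|^2 = 0.
```

Pareto stationarity is a necessary condition for Pareto optimality; when every objective is convex, a Pareto-stationary point is also weakly Pareto optimal.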
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





