arXiv

Canonicalized Stable-List Replay for Private Federated Continual Learning over Language-Model Embeddings


Abstract:

Federated continual learning (FCL) enables distributed clients to adapt language-model heads to evolving NLP tasks without exchanging raw text data. However, when user-level differential privacy (DP) is applied, replay-based continual learning encounters a structural barrier: clients are restricted to releasing only small, noisy lists of candidate replay summaries, which remain unordered across different clients. To address this, we propose Canonicalized Stable-List Replay (CSLR). In this framework, clients privately generate candidate replay distributions within a shared sentence-embedding space, and the server aligns these distributions using signatures derived from public anchor sentences. Crucially, these anchors serve to provide identifiability for aggregation purposes rather than functioning as additional replay data. We demonstrate that, provided an observable anchor-signature margin exists, $O(\log(N/\eta)/p)$ anchors are sufficient to distinguish $N$ candidate list elements with a probability of at least $1-\eta$. Additionally, we present a scoped result demonstrating non-identifiability for unordered-label oracle models in the absence of anchors.

Empirical evaluations across five seeds on continual classification, Named Entity Recognition (NER), and dialogue benchmarks reveal that CSLR enhances the final average task metric by 3.9 to 5.6 points compared to the most effective non-CSLR DP baseline at $\varepsilon=4$, given the specified replay-release budget. Furthermore, CSLR surpasses both Hungarian and optimal-transport matchers. The formal privacy guarantee encompasses the replay release process; however, achieving end-to-end private training necessitates composing this with a private optimizer for updates to the task heads.
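
To make the anchor-based alignment idea concrete, the following is a minimal illustrative sketch and not the paper's algorithm. The abstract does not specify how signatures are built, so everything here is an assumption: each candidate replay summary is reduced to a single mean vector in the shared embedding space, a signature is the sign pattern of its inner products with the public anchor embeddings, lists are put in canonical order by sorting on that signature, and the server averages slot by slot. The names `anchor_signature`, `canonicalize`, and `aggregate` are hypothetical, and DP noise is omitted.

```python
import numpy as np

def anchor_signature(candidate, anchors):
    """Assumed signature: one bit per anchor, set when the candidate's
    inner product with that anchor embedding is positive."""
    return tuple(int(s > 0) for s in anchors @ candidate)

def canonicalize(candidates, anchors):
    """Reorder a client's unordered candidate list by anchor signature."""
    return sorted(candidates, key=lambda c: anchor_signature(c, anchors))

def aggregate(client_lists, anchors):
    """Average candidates slot by slot after canonical ordering."""
    ordered = [np.stack(canonicalize(lst, anchors)) for lst in client_lists]
    return np.mean(ordered, axis=0)

# Toy data (hypothetical): two anchors, three candidate means in 2-D.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
base = [np.array([1.0, 1.0]), np.array([1.0, -1.0]), np.array([-1.0, 1.0])]
client_a = [base[0], base[1], base[2]]
# Client B holds the same candidates in a different order, slightly perturbed.
client_b = [base[2] + 0.1, base[0] - 0.1, base[1] + 0.1]

merged = aggregate([client_a, client_b], anchors)
print(merged)
```

In this toy case the perturbation is small enough that every candidate keeps its sign pattern, which plays the role of the signature margin mentioned in the abstract: both clients arrive at the same ordering even though their lists were released in different orders. With a smaller margin or larger noise, signatures could flip and the slot-wise average would mix unrelated candidates.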


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
