GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning
Title: GuidaPA: Enabling Privacy-Preserving Public Administration Chatbots Through Federated Learning
Abstract
This paper introduces GuidaPA, a specialized chatbot for the Italian Public Administration (PA) that uses Federated Learning (FL) to preserve data privacy. The model was trained on documentation from two national PA platforms, SIGESON and SIDFORS: roughly 8 pages of SIGESON manuals and 31 pages of SIDFORS manuals and FAQs. Although this study uses publicly available documents as a safe proxy, the system is intended for deployment in settings that involve restricted internal resources (such as ticket records, officer guidelines, and database extracts), which cannot be centrally aggregated because of organizational and regulatory constraints.
GuidaPA combines role-based access control, secure client-side preprocessing, explicit monitoring of non-IID data distribution effects, and parameter-efficient federated fine-tuning of large language models. The evaluation used QLoRA (4-bit quantization) over 15 federated rounds, with each client applying an 80/20 train/test split to its local data. Answer quality was measured with ROUGE, BLEU-4, and METEOR.
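The federated setup described above can be illustrated with a minimal sketch. It assumes sample-weighted FedAvg as the aggregation rule, which the abstract does not state, and it replaces local QLoRA fine-tuning with a toy update so the loop runs without a model. The names `split_80_20`, `fedavg`, `local_train`, `run_federated`, and the `lora_A` parameter key are all hypothetical.

```python
import random


def split_80_20(examples, seed=0):
    """Shuffle one client's local examples and split them 80% train / 20% test."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = (4 * len(shuffled)) // 5
    return shuffled[:cut], shuffled[cut:]


def fedavg(client_updates):
    """Sample-weighted average of client adapter weights (assumed aggregation rule).

    client_updates: list of (num_train_examples, {param_name: [floats]}).
    """
    total = sum(n for n, _ in client_updates)
    first = client_updates[0][1]
    return {
        name: [
            sum(n * weights[name][i] for n, weights in client_updates) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }


def local_train(global_weights, train_set):
    """Toy stand-in for local fine-tuning: move each weight halfway
    toward the mean of the client's (numeric) training data."""
    target = sum(train_set) / len(train_set)
    return {
        name: [v + 0.5 * (target - v) for v in values]
        for name, values in global_weights.items()
    }


def run_federated(client_data, rounds=15):
    """Run the federated loop; raw client data never leaves its client."""
    weights = {"lora_A": [0.0, 0.0]}  # hypothetical adapter parameters
    splits = [split_80_20(data, seed=i) for i, data in enumerate(client_data)]
    for _ in range(rounds):
        updates = [
            (len(train), local_train(weights, train)) for train, _ in splits
        ]
        weights = fedavg(updates)  # only adapter weights are shared
    return weights


if __name__ == "__main__":
    # Two toy clients with different data distributions (non-IID).
    print(run_federated([[1.0] * 10, [3.0] * 30]))
```

Only the adapter weights and the training-set sizes reach the server in this sketch; the held-out 20% on each client would be used for the per-client evaluation.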
The best federated model achieved ROUGE-1/2/L scores of 61.10/55.77/59.44, a BLEU-4 score of 45.02, and a METEOR score of 63.94. These results are comparable to those of private centralized fine-tuning while keeping the data local. Domain-specific fine-tuning also substantially outperformed the general-purpose baseline, raising ROUGE-1 from 41.45 to 62.18 and BLEU-4 from 26.97 to 50.90. These findings suggest that Federated Learning can deliver high-quality conversational AI for public services without centralized data sharing.
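To make the reported overlap metrics concrete, the following is a simplified sketch of a ROUGE-N F1 score. It assumes lowercased whitespace tokenization with no stemming, so it will not reproduce the paper's numbers or the output of a standard ROUGE package; the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """Simplified ROUGE-N F1 between a generated answer and a reference answer."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up example strings, not taken from the paper's data.
    generated = "open the request form from the main menu"
    expected = "open the request form in the main menu"
    print(rouge_n_f1(generated, expected, n=1))
    print(rouge_n_f1(generated, expected, n=2))
```

The function returns a value between 0 and 1; scores such as those quoted above are conventionally reported multiplied by 100.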
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




