arXiv

Towards Multidisciplinary Summarization of Hospital Stays: Efficient Sentence-Level Clinical Provenance Categorization

June 2, 2026 · Baris Karacan, Vaibhav Bhargava, Barbara Di Eugenio, Natalie Parde, Mary Khetani, Yu-Shan Tseng, Vanessa Barbosa, Julie Vignato, Lindsey Knake, Rajashree Dahal, Emily Spellman, Danielle Hitzel, Janine Petitgout, Kristi Haughey, Amanda Karstens, Brianna Cl

Title: Advancing Multidisciplinary Summarization of Hospital Stays: Streamlined Sentence-Level Clinical Provenance Categorization

Abstract: Producing comprehensive "all-team" summaries in high-stakes settings such as the Neonatal Intensive Care Unit (NICU) requires synthesizing the perspectives of many disciplines, including physicians, nurses, and therapists, from hundreds of clinical free-text records. Simply aggregating this heterogeneous text, however, often yields disjointed summaries. Effective structured summarization therefore depends on accurately classifying the provenance of each sentence across these multi-source documents. This pilot study presents a pipeline for clinical provenance categorization based on supervised fine-tuning (SFT) of large language models (LLMs). We fine-tuned two Llama-3 variants (8B and 70B) on MedSecId, a dataset of 2,002 MIMIC-III (Adult ICU) notes annotated with clinical provenance headers, and obtained in-domain Macro F1 scores above 92% for both models. To test cross-domain generalization, we evaluated how model capacity and quantization affect performance on a gold-standard dataset of 227 sentence-level spans extracted from three multidisciplinary NICU summaries. The results reveal a scale-dependent transfer effect: SFT yielded only minor improvements for the 8B model but a significant 7% increase in Macro F1 for the 70B model. Notably, the quantized, fine-tuned 70B model outperformed its full-precision counterpart while substantially reducing computational requirements. These findings suggest that sufficient model capacity is needed to preserve semantic adaptability in cross-domain clinical transfer, and that efficient quantized adaptation can support structured provenance modeling for downstream summarization.
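The abstract reports its results as Macro F1, the unweighted mean of per-class F1 scores, which weights rare provenance categories as heavily as frequent ones. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `macro_f1` and the example labels (`physician`, `nursing`, `therapy`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # predicted class p, but it was wrong
            fn[g] += 1   # true class g was missed

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 if the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical sentence-level provenance labels (illustrative only).
gold = ["physician", "nursing", "therapy", "nursing"]
pred = ["physician", "nursing", "nursing", "nursing"]
score = macro_f1(gold, pred)
```

In this toy example the missed `therapy` sentence gives that class an F1 of zero, which lowers the macro average considerably even though three of the four sentences are labeled correctly.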

Source: arXiv · Generated at: 2026-06-02 00:00:00 UTC