Multi-Modal Machine Learning for Breast Cancer Recurrence Prediction
Title: Leveraging Multi-Modal Machine Learning to Forecast Breast Cancer Recurrence
Abstract:
Breast cancer recurrence remains a primary driver of long-term mortality among survivors, so timely and precise risk assessment is essential for directing follow-up care and treatment strategies. Conventional predictive models frequently fall short because they rely exclusively on either structured data or unstructured text, and therefore fail to capture the complete clinical picture. This research investigates how combining multiple clinical data sources, such as treatment histories, pathology reports, and physician notes, can improve recurrence prediction.
To address data fragmentation, the proposed methodology pairs a rule-based regular-expression extraction system with a precedence-driven conflict reconciliation protocol. This framework extracts definitive tumor attributes from unstructured pathology narratives and uses them to supplement existing structured records. The study also benchmarks the approach against standard feature sets used in previous breast cancer research to quantify the specific benefit of multi-modal integration.
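As an illustration only, the extraction and reconciliation steps might look like the following minimal Python sketch. The abstract does not give the actual rules, so everything here is an assumption: the regular expressions, the attribute names (er_status, her2_status, grade, tumor_size_mm), the source labels, and the precedence order (pathology report first, then structured record, then physician note) are hypothetical placeholders rather than the study's method.

```python
import re
from typing import Dict, Tuple

# Hypothetical extraction rules; the study's real patterns are not published in the abstract.
PATTERNS: Dict[str, re.Pattern] = {
    "er_status": re.compile(
        r"\b(?:ER|estrogen receptor)\s*[:\-]?\s*(positive|negative)\b", re.I
    ),
    "her2_status": re.compile(
        r"\bHER-?2(?:/neu)?\s*[:\-]?\s*(positive|negative|equivocal)\b", re.I
    ),
    "grade": re.compile(r"\bgrade\s*[:\-]?\s*([1-3])\b", re.I),
    "tumor_size_mm": re.compile(
        r"\b(?:tumou?r|mass)\s+(?:size|measures|measuring)\s*[:\-]?\s*"
        r"(\d+(?:\.\d+)?)\s*(mm|cm)\b",
        re.I,
    ),
}

# Assumed precedence, highest first; the paper's actual ordering may differ.
SOURCE_PRECEDENCE = ["pathology_report", "structured_record", "physician_note"]


def extract_attributes(text: str) -> Dict[str, object]:
    """Apply each rule to a narrative and return the first match per attribute."""
    found: Dict[str, object] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            continue
        if name == "tumor_size_mm":
            value = float(match.group(1))
            if match.group(2).lower() == "cm":
                value *= 10  # normalize centimeters to millimeters
            found[name] = round(value, 1)
        elif name == "grade":
            found[name] = int(match.group(1))
        else:
            found[name] = match.group(1).lower()
    return found


def reconcile(
    candidates: Dict[str, Dict[str, object]]
) -> Dict[str, Tuple[object, str]]:
    """Keep, for each attribute, the value from the highest-precedence source."""
    merged: Dict[str, Tuple[object, str]] = {}
    for source in SOURCE_PRECEDENCE:
        for name, value in candidates.get(source, {}).items():
            if value is not None and name not in merged:
                merged[name] = (value, source)  # record provenance with the value
    return merged


# Made-up example inputs, for illustration only.
report = (
    "Invasive ductal carcinoma, grade 2. Tumor size: 1.5 cm. "
    "ER: positive. HER2 negative."
)
structured = {"grade": 3, "er_status": "positive", "nodes_positive": 0}

merged = reconcile(
    {
        "pathology_report": extract_attributes(report),
        "structured_record": structured,
    }
)
print(merged)
```

In this sketch the conflicting grade (2 in the narrative, 3 in the structured record) resolves to the pathology value because that source is ranked first, while nodes_positive, which only the structured record supplies, passes through unchanged. Storing the winning source next to each value is a design choice that keeps every reconciled attribute auditable.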
Both single-source and multi-modal inputs were evaluated across multiple machine learning architectures. The findings indicate that multi-modal inputs yield a consistent and significant improvement in predictive accuracy over inputs drawn from a single data type.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



