arXiv

Early Prediction of Liver Cirrhosis Up to Two Years in Advance: A Machine Learning Study Benchmarking Against the FIB-4 and APRI Scores

June 2, 2026 · Zhuqi Miao, Ahmed G Qasem, Sujan Ravi, Jason T. Cheng, Abdulaziz Ahmed, Courtney W. Houchen, Sumayah Abed, Dilorom Azimdjanovna Zuparova

Title: Machine Learning Surpasses Traditional Scores in Forecasting Liver Cirrhosis Two Years Before Diagnosis

Abstract:

Objective: The primary goal of this study was to create and assess machine learning (ML) algorithms capable of forecasting the onset of liver cirrhosis (LC) one and two years before clinical diagnosis. This was achieved using standard electronic health record (EHR) data, with the models’ efficacy benchmarked against established clinical metrics, specifically the FIB-4 and APRI scores.
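For reference, both benchmark scores are closed-form indices computed from routine laboratory values. The sketch below uses the commonly published definitions: FIB-4 = (age × AST) / (platelets × √ALT) and APRI = (AST / upper limit of normal AST) × 100 / platelets, with platelets in 10^9/L and enzymes in U/L. The function names, argument names, and the default upper limit of 40 U/L are illustrative assumptions, not details taken from the study.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_1e9_l):
    """FIB-4 index: (age * AST) / (platelets * sqrt(ALT))."""
    if alt_u_l <= 0 or platelets_1e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, platelets_1e9_l, ast_uln_u_l=40.0):
    """APRI: (AST / upper limit of normal AST) * 100 / platelets.

    The 40 U/L default is an assumed reference limit; laboratories differ.
    """
    if ast_uln_u_l <= 0 or platelets_1e9_l <= 0:
        raise ValueError("AST upper limit and platelet count must be positive")
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_1e9_l


# Hypothetical patient values, chosen only to make the arithmetic easy to follow.
print(fib4(age_years=50, ast_u_l=40, alt_u_l=25, platelets_1e9_l=200))
print(apri(ast_u_l=40, platelets_1e9_l=200))
```

Because each score depends on only three or four inputs, it serves as a natural baseline for a model that can draw on many more EHR features.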

Methods: Utilizing de-identified EHR data from a major academic medical center, we performed a retrospective cohort analysis. We engineered XGBoost models tailored for 1-year and 2-year prediction windows. To optimize predictive accuracy, we employed specific feature selection techniques alongside Bayesian hyperparameter tuning. The resulting models were validated on separate test sets. Their performance was rigorously compared against FIB-4 and APRI scores using several metrics: accuracy, precision, recall, F1 score, area under the precision-recall curve (PR AUC), and area under the receiver operating characteristic curve (AUC).
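As a rough illustration of the evaluation metrics listed above, the following dependency-free sketch computes them for a binary outcome. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy data are assumptions, and PR AUC is approximated here by average precision, which is one common estimator of the area under the precision-recall curve.

```python
def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall and F1 at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: sum of (recall increase * precision) over the
    distinct score thresholds, from highest score to lowest."""
    n_pos = sum(y_true)
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before updating the curve.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap


# Toy labels and scores, purely illustrative (not study data).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(threshold_metrics(y_true, scores))
print(roc_auc(y_true, scores), average_precision(y_true, scores))
```

The same functions apply to FIB-4 or APRI values used as risk scores, since ROC AUC and average precision depend only on how the scores rank patients.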

Results: The final datasets comprised 60,481 patients for the 1-year prediction cohort and 47,322 patients for the 2-year cohort. In both timeframes, the optimized ML models outperformed both FIB-4 and APRI. Specifically, the XGBoost models yielded AUCs of 0.872 and 0.839 for 1- and 2-year predictions, respectively. In contrast, FIB-4 achieved AUCs of 0.756 and 0.723, while APRI reached 0.798 and 0.761. The disparity was even more pronounced in the precision-recall analysis, where XGBoost recorded PR AUCs of 0.657 and 0.562, well above FIB-4's 0.456 and 0.373 and APRI's 0.504 and 0.421. Notably, the relative advantage of the ML models grew as the prediction horizon extended (for example, the ratio of XGBoost's PR AUC to FIB-4's rose from about 1.44 to about 1.51), while the absolute gaps stayed roughly constant, suggesting a sustained ability to discriminate risk early on.
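The size of the improvement can be checked directly from the figures reported above. The short sketch below tabulates, for each metric and baseline, the absolute gap and the ratio between XGBoost and the baseline at the 1-year and 2-year windows. The dictionary layout and the helper name `gaps` are illustrative choices; the numbers are copied from the Results paragraph.

```python
# (1-year, 2-year) values as reported in the Results paragraph.
reported = {
    "ROC AUC": {
        "XGBoost": (0.872, 0.839),
        "FIB-4": (0.756, 0.723),
        "APRI": (0.798, 0.761),
    },
    "PR AUC": {
        "XGBoost": (0.657, 0.562),
        "FIB-4": (0.456, 0.373),
        "APRI": (0.504, 0.421),
    },
}


def gaps(metric, baseline):
    """Return [(absolute gap, ratio)] for the 1-year and 2-year windows."""
    model = reported[metric]["XGBoost"]
    base = reported[metric][baseline]
    return [(round(m - b, 3), round(m / b, 3)) for m, b in zip(model, base)]


for metric in reported:
    for baseline in ("FIB-4", "APRI"):
        print(metric, "vs", baseline, gaps(metric, baseline))
```

On these numbers, the ratio of XGBoost to each baseline rises from the 1-year to the 2-year window in every comparison, while the absolute PR AUC gaps narrow slightly and the absolute AUC gap over FIB-4 is unchanged at 0.116.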

Conclusions: Leveraging routine EHR data, machine learning models offer a substantial improvement over traditional FIB-4 and APRI scores for the early detection of liver cirrhosis. These advanced tools facilitate more precise and timely risk stratification. By integrating these models into clinical practice as automated decision-support systems, healthcare providers can enhance proactive prevention strategies and improve the management of cirrhosis.
