arXiv

Early Prediction of Liver Cirrhosis Up to Two Years in Advance: A Machine Learning Study Benchmarking Against the FIB-4 and APRI Scores

Title: Machine Learning Surpasses Traditional Scores in Forecasting Liver Cirrhosis Two Years Before Diagnosis

Abstract:

Objective: To develop and evaluate machine learning (ML) models that predict the onset of liver cirrhosis (LC) one and two years before clinical diagnosis from routine electronic health record (EHR) data, and to benchmark them against two established clinical scores, FIB-4 and APRI.
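For readers unfamiliar with the two baseline scores, the sketch below computes them from their standard published formulas: FIB-4 = (age × AST) / (platelets × √ALT) and APRI = (AST / upper limit of normal AST) × 100 / platelets. The function names, the example patient, and the AST upper limit of 40 U/L are illustrative assumptions. The abstract does not say how the study derived these scores from the EHR data.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index: (age * AST) / (platelet count * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, ast_uln_u_l, platelets_10e9_l):
    """APRI: (AST / upper limit of normal AST) * 100 / platelet count."""
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_10e9_l


# Hypothetical patient, not taken from the study data.
patient = {
    "age_years": 60,
    "ast_u_l": 80.0,            # AST in U/L
    "alt_u_l": 64.0,            # ALT in U/L
    "platelets_10e9_l": 120.0,  # platelet count in 10^9/L
}
AST_ULN = 40.0  # assumed upper limit of normal; laboratories differ

fib4_score = fib4(**patient)
apri_score = apri(patient["ast_u_l"], AST_ULN, patient["platelets_10e9_l"])
print(f"FIB-4 = {fib4_score:.2f}, APRI = {apri_score:.2f}")
```

Both scores use only age and three routine laboratory values, which is why they serve as a natural baseline for models built on broader EHR data.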

Methods: We conducted a retrospective cohort study using de-identified EHR data from a major academic medical center. We trained XGBoost models for 1-year and 2-year prediction windows, applying feature selection and Bayesian hyperparameter tuning to optimize predictive performance. The models were validated on separate test sets and compared against FIB-4 and APRI on accuracy, precision, recall, F1 score, area under the precision-recall curve (PR AUC), and area under the receiver operating characteristic curve (AUC).
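The evaluation metrics listed above can be illustrated with a minimal, dependency-free sketch. It is not the study's evaluation code. It assumes binary labels (1 = later diagnosed with cirrhosis), untied scores, and average precision as the PR AUC estimator, which is one common choice and may differ from the one the authors used. The toy labels, scores, and the 0.65 threshold are made up for illustration.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise PR AUC: mean precision at each rank where a positive appears."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def threshold_metrics(y_true, scores, threshold):
    """Accuracy, precision, recall, and F1 after binarizing scores at `threshold`."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    tn = len(y_true) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}


# Toy data, not from the study.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

auc = roc_auc(y_true, scores)
pr_auc = average_precision(y_true, scores)
metrics = threshold_metrics(y_true, scores, threshold=0.65)
print(auc, pr_auc, metrics)
```

AUC and PR AUC are computed from the ranking alone, so the same functions apply to model probabilities and to raw FIB-4 or APRI values. Accuracy, precision, recall, and F1 additionally require a decision threshold for each score.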

Results: The final datasets comprised 60,481 patients for the 1-year prediction cohort and 47,322 patients for the 2-year cohort. At both horizons, the optimized ML models outperformed FIB-4 and APRI. The XGBoost models achieved AUCs of 0.872 and 0.839 for 1- and 2-year predictions, respectively, compared with 0.756 and 0.723 for FIB-4 and 0.798 and 0.761 for APRI. The gap was wider in the precision-recall analysis, where XGBoost recorded PR AUCs of 0.657 and 0.562, well above FIB-4's 0.456 and 0.373 and APRI's 0.504 and 0.421. Notably, the advantage of the ML models grew as the prediction horizon extended, suggesting a sustained ability to discriminate risk early.
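The size of the reported advantage can be tabulated directly from the figures in the Results paragraph. The sketch below is plain arithmetic on those published numbers. The `advantage` helper and the choice to report both an absolute gap and a ratio are illustrative assumptions, not part of the study.

```python
# Figures copied from the Results paragraph: (1-year, 2-year) per model.
reported = {
    "AUC": {
        "XGBoost": (0.872, 0.839),
        "FIB-4": (0.756, 0.723),
        "APRI": (0.798, 0.761),
    },
    "PR AUC": {
        "XGBoost": (0.657, 0.562),
        "FIB-4": (0.456, 0.373),
        "APRI": (0.504, 0.421),
    },
}


def advantage(metric, baseline):
    """Return [(absolute gap, ratio), ...] of XGBoost over `baseline`, one pair per horizon."""
    ml = reported[metric]["XGBoost"]
    ref = reported[metric][baseline]
    return [(round(m - r, 3), round(m / r, 3)) for m, r in zip(ml, ref)]


for metric in reported:
    for baseline in ("FIB-4", "APRI"):
        one_year, two_year = advantage(metric, baseline)
        print(f"{metric} vs {baseline}: 1-year {one_year}, 2-year {two_year}")
```

On these figures, the ratio of XGBoost to each baseline is larger at two years than at one year for both metrics. The absolute PR AUC gap is slightly smaller at two years, so the growing advantage is clearest in relative terms.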

Conclusions: ML models built on routine EHR data substantially outperform the traditional FIB-4 and APRI scores for early detection of liver cirrhosis, enabling more precise and timely risk stratification. Integrated into clinical practice as automated decision-support tools, such models could strengthen proactive prevention and improve the management of cirrhosis.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
