arXiv

Uncertainty-Calibrated Explainable Artificial Intelligence for Fetal Ultrasound Plane Classification: A Systematic Review

Abstract

Fetal ultrasound is the foundation of antenatal care, and precise identification of a small set of standard anatomical planes is essential for biometry, growth monitoring, and the detection of structural anomalies. Deep learning classifiers now match or exceed human experts in accuracy on curated benchmarks, but many of these models are opaque and miscalibrated. As a result, clinicians often lack the calibrated confidence scores and reliable explanations required for safe decision support.

Following PRISMA 2020 guidelines, we conducted a systematic review of 78 studies published between January 1, 2015, and April 30, 2026, that combined automated fetal plane classification with either explainability techniques or predictive uncertainty quantification. The pooled balanced accuracy across six standard planes was 0.93 (95% CI 0.91 to 0.95). Reporting of reliability metrics, however, remains limited: only 19 studies (24%) reported calibration measures, and just 14 (18%) addressed selective prediction.
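To make the two reliability measures concrete, the following is a minimal sketch, not taken from any reviewed study. It assumes each prediction is summarized by its top-class confidence and a correctness flag. The function names `expected_calibration_error` and `selective_accuracy` are hypothetical, and the equal-width binning is one common variant of expected calibration error among several.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE: sample-weighted mean of |accuracy - confidence|."""
    n = len(confidences)
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def selective_accuracy(
    confidences: Sequence[float], correct: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Abstain below `threshold`; return (coverage, accuracy on retained cases)."""
    kept = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(kept) / len(confidences)
    accuracy = sum(kept) / len(kept) if kept else float("nan")
    return coverage, accuracy


if __name__ == "__main__":
    # Toy values for illustration only; not data from the review.
    conf = [0.95, 0.92, 0.85, 0.62, 0.55]
    hit = [True, True, True, False, True]
    print("ECE:", expected_calibration_error(conf, hit))
    print("coverage, accuracy at 0.8:", selective_accuracy(conf, hit, 0.8))
```

Sweeping `threshold` from 0 to 1 traces an accuracy-coverage curve, which is one way a study could report selective prediction alongside a calibration measure.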

To address these gaps, we introduce CALIB-XFUS, a 22-item reporting framework designed to operationalize calibration, explanation faithfulness, and fairness for regulated fetal ultrasound artificial intelligence. This framework covers six key domains: clinical task and indication for use; dataset provenance and representativeness; model and training pipeline; calibration and selective prediction; explanation faithfulness and clinician validation; and post-market surveillance. We contend that achieving uncertainty calibration, faithful explanations, and fairness auditing in fetal ultrasound AI is not only technically viable but also a regulatory expectation under the FDA’s Good Machine Learning Practice principles and the high-risk obligations outlined in the EU AI Act.


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
