arXiv

Learning What Not to Impute: An Uncertainty-Aware Diffusion Framework for Meaningful Missingness

Title: Identifying Meaningful Absences: An Uncertainty-Driven Diffusion Approach for Selective Imputation

Abstract:

Handling missing values is a core problem in machine learning, yet conventional techniques typically assume that every absent entry is an unobserved realization of an ordinary value. This assumption overlooks the fact that missingness in practical datasets often has two fundamentally different causes: some entries are genuinely absent and semantically appropriate (meaningfully missing), while others are lost to observational limitations and need to be reconstructed. We formalize this setting as selective imputation, where the goal is to decide simultaneously which gaps should be left intact and which should be filled. We introduce Diff-Joint, a novel framework that uses diffusion models to jointly model tabular data and a latent mask representing missingness patterns. The algorithm refines both the imputed values and the classification of missingness through an iterative process that alternates between conditional sampling and uncertainty-aware aggregation. Experiments on synthetic and real-world benchmarks show that Diff-Joint distinguishes meaningfully missing entries while delivering competitive imputation accuracy and improving performance on downstream tasks.
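To make the idea of selective imputation concrete, the following is a minimal sketch of a single round of conditional sampling followed by uncertainty-aware aggregation. It is not the Diff-Joint algorithm: the abstract does not specify the sampler or the decision rule, so both are assumptions made for illustration. The names `column_gaussian_sampler`, `selective_impute`, and `std_threshold` are hypothetical, the sampler is a per-column Gaussian standing in for a diffusion model, and the rule "leave an entry missing when its sampled values disagree strongly" is one plausible reading of uncertainty-aware aggregation.

```python
import random
import statistics


def column_gaussian_sampler(rows, col, rng):
    """Stand-in conditional sampler (NOT a diffusion model).

    Draws one value from a Gaussian fitted to the observed entries of `col`.
    """
    observed = [r[col] for r in rows if r[col] is not None]
    return rng.gauss(statistics.fmean(observed), statistics.pstdev(observed))


def selective_impute(rows, sampler, n_samples=64, std_threshold=1.0, seed=0):
    """One round of sampling plus aggregation over missing entries (None).

    Returns (imputed, keep_missing). An entry is filled with the mean of its
    samples when their spread is small; otherwise it is left as None and
    flagged in keep_missing. The threshold rule is an assumption of this sketch.
    """
    rng = random.Random(seed)
    imputed = [list(r) for r in rows]
    keep_missing = [[False] * len(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue  # observed entries are never modified
            # Conditional sampling: several candidate values for this gap.
            draws = [sampler(rows, j, rng) for _ in range(n_samples)]
            # Uncertainty-aware aggregation: use the spread of the candidates.
            if statistics.pstdev(draws) > std_threshold:
                keep_missing[i][j] = True
            else:
                imputed[i][j] = statistics.fmean(draws)
    return imputed, keep_missing


rows = [[1.0, 0.0], [1.1, 10.0], [0.9, -10.0], [None, None]]
imputed, keep_missing = selective_impute(rows, column_gaussian_sampler)
```

In this toy input the first column has tightly clustered observed values, so its gap is filled, while the second column is widely spread, so its gap is left as `None` and flagged. A fuller version would repeat the round, feeding the updated mask back into the sampler, to mirror the iterative alternation the abstract describes.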


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
