arXiv

Do Real-World Datasets Contain Natural Experiments? An Empirical Study Using Causal Feature Selection

Natural experiments are defined as real-world events that act as implicit interventions, impacting specific individuals or groups while leaving others unaffected. A prime example is the coronavirus pandemic, which served as an intervention by the virus on the subset of the population it infected. This study investigates whether such natural experiments exist within current real-world datasets and explores the appropriate methods for handling them.

To identify natural experiments, we use causal discovery techniques to reconstruct the underlying causal graph and then perform feature selection based on the causal relationships it identifies. The core hypothesis is that if modeling the data as interventional, rather than observational, improves downstream performance, then the dataset contains natural experiments.
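The difference between the observational and interventional views can be illustrated on a toy graph. The sketch below is not the paper's method: it assumes that feature selection means taking the Markov blanket of the target, and that an intervention on a node is modeled by removing that node's incoming edges (standard graph surgery). The graph, the node names, and the helpers `markov_blanket` and `mutilate` are hypothetical.

```python
def markov_blanket(parents, target):
    """Return parents, children, and co-parents of `target`.

    `parents` maps each node to the set of its direct causes.
    """
    children = {n for n, ps in parents.items() if target in ps}
    spouses = {p for c in children for p in parents[c]} - {target}
    return set(parents.get(target, set())) | children | spouses


def mutilate(parents, intervened):
    """Model interventions by cutting all edges into the intervened nodes."""
    return {n: (set() if n in intervened else set(ps))
            for n, ps in parents.items()}


# Hypothetical graph: D -> T, T -> B, A -> B, T -> C
parents = {
    "A": set(),
    "D": set(),
    "T": {"D"},
    "B": {"A", "T"},
    "C": {"T"},
}

# Observational view: every edge is intact.
observational_features = markov_blanket(parents, "T")

# Interventional view: B is assumed to be set by an external event.
interventional_features = markov_blanket(mutilate(parents, {"B"}), "T")

print(sorted(observational_features))
print(sorted(interventional_features))
```

In this toy case the two views select different feature sets for the same target, which is why downstream performance could plausibly separate them.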

We first tested this hypothesis on synthetic datasets generated from constructed graphs, both with and without embedded natural experiments. After this validation, we carried out an empirical assessment across a broad range of real-world datasets. The results indicate that natural experiments are present in real-world data, and that exploiting them with causal inference methods can significantly improve model performance. This work is a preliminary exploration of the topic within a limited scope.
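A minimal sketch of what a synthetic dataset with an embedded natural experiment could look like is given below. It is only an illustration under stated assumptions: the chain graph X -> Y -> Z, the linear coefficients, the noise levels, and the helper names `sample_chain` and `pearson` are all hypothetical and are not taken from the paper. For the affected rows, Y is set by an external shock instead of by X, which mimics a hard intervention on Y.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample_chain(n, affected_fraction=0.0, seed=0):
    """Sample n rows (x, y, z, affected) from the toy graph X -> Y -> Z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        affected = rng.random() < affected_fraction
        x = rng.gauss(0.0, 1.0)
        if affected:
            # Natural experiment: Y is set externally, so X no longer drives it.
            y = rng.gauss(3.0, 1.0)
        else:
            y = 2.0 * x + rng.gauss(0.0, 0.5)
        # The mechanism Y -> Z is the same for every row.
        z = -1.5 * y + rng.gauss(0.0, 0.5)
        rows.append((x, y, z, affected))
    return rows


rows = sample_chain(5000, affected_fraction=0.3, seed=1)
affected_rows = [r for r in rows if r[3]]
unaffected_rows = [r for r in rows if not r[3]]

r_xy_unaffected = pearson([r[0] for r in unaffected_rows],
                          [r[1] for r in unaffected_rows])
r_xy_affected = pearson([r[0] for r in affected_rows],
                        [r[1] for r in affected_rows])
r_yz_affected = pearson([r[1] for r in affected_rows],
                        [r[2] for r in affected_rows])

print(r_xy_unaffected, r_xy_affected, r_yz_affected)
```

In this sketch the X-Y dependence should vanish among the affected rows while the Y-Z dependence remains, which is the kind of signature an interventional model can exploit and a purely observational model would blur.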


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
