arXiv

Testing Most Influential Sets

Title: Evaluating the Extremes of Influence

Small subsets of data can wield disproportionate power over model outcomes, with just a handful of observations capable of reversing significant conclusions. Although recent research has focused on identifying these highly influential groups, a formal method for distinguishing between excessive influence and the natural variation expected from random sampling has been lacking. To bridge this gap, we introduce a rigorous framework for analyzing the most influential sets.

Focusing on linear least-squares regression, we derive an exact formula for influence and characterize the extreme value distributions of maximal influence. Maximal influence follows the heavy-tailed Fréchet distribution when sets are of constant size or the data are heavy-tailed, and the lighter-tailed Gumbel distribution when sets grow with the sample or the data are light-tailed. These results enable rigorous hypothesis tests of whether influence is statistically excessive. We validate the approach on machine learning benchmarks and on applications in biology and economics, showing that it can replace informal heuristics with formal inference and help resolve previously contested research findings.
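As a rough illustration of what "influence" of a set means in least-squares regression, the sketch below computes the change in the OLS coefficients caused by deleting a given set of rows. It relies on the standard leave-set-out identity for linear regression, not necessarily the expression derived in the paper; the function name `set_influence` and its arguments are hypothetical, and the search for the most influential set and the extreme value tests are not shown.

```python
import numpy as np

def set_influence(X, y, idx):
    """Change in OLS coefficients from deleting the rows in `idx`.

    Standard identity (an assumption here, not taken from the paper):
        beta_full - beta_without_S = (X'X)^{-1} X_S' (I - H_SS)^{-1} e_S
    where H_SS is the hat-matrix block for the rows in S and e_S are
    their full-sample residuals.
    """
    idx = np.asarray(idx)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y              # full-sample OLS estimate
    X_S = X[idx]
    e_S = y[idx] - X_S @ beta             # residuals of the deleted rows
    H_SS = X_S @ XtX_inv @ X_S.T          # leverage block for the set
    correction = np.linalg.solve(np.eye(len(idx)) - H_SS, e_S)
    return XtX_inv @ X_S.T @ correction
```

The result can be checked against a brute-force refit on the remaining rows. The identity avoids refitting the model for every candidate set, which matters when many sets have to be compared.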


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
