arXiv

Model Multiplicity and Predictive Arbitrariness in Recidivism Risk Assessment

Abstract:

Decision-making processes that rely on predictions about individual futures must contend with inherent noise, which can allow multiple models to achieve comparable accuracy. When these models yield conflicting forecasts for the same person, the disagreement raises significant concerns about the fairness and consistency of high-stakes decisions. This paper investigates the theoretical and practical magnitude of such arbitrariness and explores methods to mitigate it within risk assessment frameworks.

We examine these issues through an analysis of a machine learning-driven decision support system for recidivism risk assessment that has been operational for more than 15 years. To begin, we developed a dataset comprising thousands of inmate release cases by converting complex legal statutes into an algorithmic framework for labeling post-release outcomes as either recidivist or non-recidivist. Leveraging this data, we trained interpretable models that not only enhanced predictive accuracy but also narrowed error-rate disparities across demographic groups. Furthermore, these models were designed to ensure that evidence of rehabilitative progress resulted in lower risk scores.

Our investigation into predictive multiplicity involved two steps. First, we established a tight lower bound on the expected level of agreement among any finite collection of models across a dataset. Second, we assessed how structural variations within that collection, such as differences in model coefficients, manifested as predictive multiplicity, defined here as divergent predictions for identical individuals.
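To make the two quantities above concrete, the following is a minimal sketch of how agreement among a finite collection of models, and the share of individuals who receive divergent predictions, might be measured empirically. It does not reproduce the paper's theoretical lower bound. The function names and the toy predictions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pairwise_agreement(preds_a, preds_b):
    """Fraction of individuals on whom two models predict the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def mean_pairwise_agreement(all_preds):
    """Average agreement over every unordered pair of models."""
    pairs = list(combinations(all_preds, 2))
    if not pairs:
        raise ValueError("need at least two models")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


def ambiguity(all_preds):
    """Fraction of individuals who get conflicting labels from at least two models."""
    n = len(all_preds[0])
    conflicted = sum(1 for i in range(n) if len({p[i] for p in all_preds}) > 1)
    return conflicted / n


# Hypothetical binary predictions (1 = recidivist) from three models on five people.
model_preds = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [1, 0, 0, 0, 0],
]

print(mean_pairwise_agreement(model_preds))
print(ambiguity(model_preds))
```

In this toy collection the models disagree on two of the five individuals, so high average agreement and nonzero predictive multiplicity can coexist.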

Our experimental results demonstrate that the presence of numerous models with similar accuracy and comparable error-rate disparities does not inevitably lead to severe predictive multiplicity. In practice, models with similar performance levels often show significantly higher agreement than the conservative limits suggested by worst-case theoretical bounds. Consequently, we identify a straightforward policy, assigning each inmate the lowest risk score generated by the set of equally accurate models, as an effective way to minimize predictive arbitrariness.
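The lowest-score policy described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes every model in the set has already been judged equally accurate, that each model outputs one numeric risk score per individual, and that lower scores mean lower risk. The function name and the scores are hypothetical.

```python
def lowest_risk_scores(scores_by_model):
    """For each individual, return the minimum score across all models.

    scores_by_model: list of per-model score lists, all in the same
    individual order (an assumption of this sketch).
    """
    if not scores_by_model:
        raise ValueError("need at least one model")
    if len({len(scores) for scores in scores_by_model}) != 1:
        raise ValueError("every model must score the same individuals")
    # zip(*...) regroups the scores so each tuple holds one person's scores.
    return [min(person_scores) for person_scores in zip(*scores_by_model)]


# Hypothetical risk scores from three equally accurate models for three people.
scores_by_model = [
    [0.62, 0.10, 0.45],
    [0.55, 0.15, 0.52],
    [0.70, 0.08, 0.48],
]

final_scores = lowest_risk_scores(scores_by_model)
print(final_scores)  # [0.55, 0.08, 0.45]
```

Because the result depends only on the set of models and not on which one happens to be deployed, each individual's final score is the same regardless of model choice.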


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
