arXiv

Why Do Self-Harm Prediction Models Struggle to Generalise? Lexical and Semantic Variations in Emergency Department Triage Notes

Abstract:

Presentations to emergency departments (EDs) involving self-harm are closely linked to an increased risk of suicide. While natural language processing (NLP) systems have demonstrated strong capabilities in identifying self-harm behaviors from triage notes within individual hospitals, their accuracy frequently diminishes when applied across different institutions. This study investigates the underlying reasons for this performance gap by comparing triage documentation from two distinct hospitals, focusing on lexical traits, key predictive features, and prominent topics. The analysis uncovers significant disparities in how self-harm is lexically expressed and which features hold predictive weight across sites, even though fundamental themes like self-injury and self-poisoning remain consistent. These variations in clinical documentation contribute to the decline in cross-institutional model performance. Our results offer valuable insights into how institutional differences influence the detection of self-harm in clinical narratives and suggest avenues for enhancing the generalizability of predictive models.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
