arXiv

NormEval: A Unified Multi-Metric Framework for Evaluating Semantic Fidelity in Text Normalization

Abstract:

Stemming and lemmatization are essential building blocks within natural language processing (NLP) workflows. However, as new normalization tools emerge for a variety of languages, the methods used to assess them remain disjointed. Current evaluations often rely on isolated metrics such as Compression Ratio, downstream accuracy, or sequence-to-sequence prediction scores. This fragmented approach fails to differentiate between useful vocabulary reduction and detrimental semantic distortion. Given that text normalization supports critical intelligent systems in high-stakes fields like legal document analysis and clinical decision support, a rigorous and principled evaluation methodology is crucial.

To address these challenges, this paper introduces NormEval, a comprehensive, multilingual evaluation framework. The framework integrates five complementary metrics: Compression Ratio (CR), Model Performance Delta (MPD), Information Retention Score (IRS), Algorithm Effectiveness Score (AES), and Average Normalized Levenshtein Distance (ANLD). Together, these metrics evaluate normalization quality across three distinct dimensions: macro-level efficiency, downstream utility, and micro-level morphological fidelity.
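As a rough illustration of the macro-level efficiency dimension, the sketch below computes a Compression Ratio over a toy token list. The abstract does not give the formula, so this assumes CR is the number of unique word types before normalization divided by the number after. The function name `compression_ratio` and the sample tokens are hypothetical and are not taken from the paper.

```python
def compression_ratio(original_tokens, normalized_tokens):
    """Vocabulary-level compression ratio.

    ASSUMPTION: CR = |unique original types| / |unique normalized types|.
    The paper may define CR differently; this is only an illustrative sketch.
    """
    original_vocab = set(original_tokens)
    normalized_vocab = set(normalized_tokens)
    if not normalized_vocab:
        raise ValueError("normalized_tokens must contain at least one token")
    return len(original_vocab) / len(normalized_vocab)


# Hypothetical example: five surface forms collapse to three normalized forms.
tokens = ["running", "runs", "ran", "cats", "cat"]
stems = ["run", "run", "ran", "cat", "cat"]
cr = compression_ratio(tokens, stems)
print(f"CR = {cr:.3f}")
```

Under this assumed definition, a higher value means a more aggressive vocabulary reduction. On its own it cannot tell useful conflation from semantic distortion, which is the gap the remaining metrics are meant to cover.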

A central component of this framework is the "Safety Gate" hypothesis, which positions ANLD as an intrinsic structural hygiene check. By leveraging character-level divergence ($\Delta$), ANLD exposes aggressive mutations that might otherwise be masked by macro-level embeddings or downstream task performance. Comprehensive ablation studies conducted on both English and Bangla datasets demonstrate that every component of the framework is vital. The removal of any single metric degrades performance in at least one evaluation aspect, ultimately leading to inaccurate algorithm rankings.
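To make the character-level check concrete, the following sketch computes an average normalized Levenshtein distance over pairs of original and normalized word forms. The abstract does not state the exact formula, so this assumes each pair's edit distance is divided by the length of the longer string and the results are averaged. The function names `levenshtein` and `anld` and the sample pairs are hypothetical.

```python
def levenshtein(a, b):
    """Classic edit distance (insert, delete, substitute), two-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def anld(pairs):
    """Average normalized Levenshtein distance over (original, normalized) pairs.

    ASSUMPTION: each distance is normalized by the longer string's length,
    giving a value in [0, 1]; the paper's normalization may differ.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    total = 0.0
    for original, normalized in pairs:
        longest = max(len(original), len(normalized))
        total += levenshtein(original, normalized) / longest if longest else 0.0
    return total / len(pairs)


# Hypothetical example: a conservative normalizer vs. an aggressive one.
conservative = [("cats", "cat"), ("running", "run")]
aggressive = [("cats", "c"), ("running", "r")]
print(anld(conservative), anld(aggressive))
```

Under these assumptions, a score near 0 means the normalized forms stay close to their sources, while a score near 1 flags heavy truncation or rewriting. That is the kind of aggressive mutation the "Safety Gate" hypothesis expects ANLD to surface.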


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
