arXiv

SN-WER: Script-Normalized WER for Multi-Script Indic ASR Evaluation

Title: SN-WER: A Script-Normalized Approach to WER in Multi-Script Indic ASR Assessment

Abstract:

Word Error Rate (WER) remains the standard metric for evaluating automatic speech recognition (ASR) systems, but it tends to inflate error counts when the reference and hypothesis render the same words in different scripts. This mismatch is especially common in multilingual settings, where ASR models frequently output romanized text. To address this, we introduce Script-Normalized WER (SN-WER), an evaluation-only scoring procedure that requires no training. SN-WER transliterates both the reference and the hypothesis into a language-specific canonical script before computing WER.
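The transliterate-then-score idea can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup table `TOY_HINDI_LEXICON` and the helper names `to_canonical`, `wer`, and `sn_wer` are hypothetical, and the dictionary lookup stands in for whatever real transliterator an evaluation would use.

```python
# Toy romanized-to-Devanagari lookup. Hypothetical stand-in for a real
# transliterator; the entries are illustrative only.
TOY_HINDI_LEXICON = {
    "mera": "मेरा",
    "naam": "नाम",
    "hai": "है",
}


def to_canonical(text, lexicon=TOY_HINDI_LEXICON):
    """Split on whitespace and map each token to the canonical script if known."""
    return [lexicon.get(token.lower(), token) for token in text.split()]


def edit_distance(ref_tokens, hyp_tokens):
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, ref_tok in enumerate(ref_tokens, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp_tokens, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref_tokens, hyp_tokens):
    """Standard WER: edit distance divided by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def sn_wer(reference, hypothesis):
    """Script-normalized WER: normalize both sides, then score as usual."""
    return wer(to_canonical(reference), to_canonical(hypothesis))


reference = "मेरा नाम राम है"
hypothesis = "mera naam shyam hai"  # romanized output with one wrong word

print(wer(reference.split(), hypothesis.split()))  # 1.0: every token mismatches
print(sn_wer(reference, hypothesis))               # 0.25: only the wrong word counts
```

In this toy case, plain WER penalizes all four words because of the script mismatch, while the normalized score keeps only the genuine substitution.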

We evaluate SN-WER on five Indic languages, two datasets, and three ASR models. On the curated FLEURS data, SN-WER shrinks artificially widened performance gaps between models by as much as 12%. On the noisier Common Voice dataset, the reductions in error rate are minimal or inconsistent, which suggests that the errors there stem from genuine recognition failures rather than script mismatches.

Controlled stress tests show that SN-WER removes 67% of the WER inflation caused by artificial romanization. Lexical-substitution controls indicate that SN-WER remains about as sensitive to semantic errors as standard WER, with a ΔSN-WER/ΔWER ratio of approximately 1.09. The method is robust to the choice of transliterator and normalization scheme, with token-collision rates under 0.1% in the tested Indic settings. We argue that SN-WER should be adopted as a companion metric alongside WER and Character Error Rate (CER) for script-agnostic ASR evaluation, particularly when transcripts feed downstream tasks such as search, indexing, or multilingual large language model pipelines.
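The two stress-test figures can be read as ratios of metric changes. The formalization below is one plausible reading, not a definition taken from the paper: the transliteration map $T$, the subscripts "orig" and "rom" (scores before and after artificial romanization), and the exact form of the mitigated fraction are assumptions made for illustration.

```latex
% Standard WER over N reference words, with S substitutions,
% D deletions, and I insertions:
\mathrm{WER}(r, h) = \frac{S + D + I}{N}

% SN-WER applies an assumed transliteration map T to both sides first:
\text{SN-WER}(r, h) = \mathrm{WER}\bigl(T(r),\, T(h)\bigr)

% Assumed reading of "mitigates 67% of the WER inflation":
\frac{\mathrm{WER}_{\text{rom}} - \text{SN-WER}_{\text{rom}}}
     {\mathrm{WER}_{\text{rom}} - \mathrm{WER}_{\text{orig}}} \approx 0.67

% Sensitivity to lexical substitutions, where \Delta is the change in
% each metric after the substitutions are injected:
\frac{\Delta\,\text{SN-WER}}{\Delta\,\mathrm{WER}} \approx 1.09
```

Under this reading, a mitigated fraction of 1 would mean the normalized score fully recovers the pre-romanization WER, and a sensitivity ratio near 1 would mean normalization does not mask genuine word errors.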


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
