arXiv

A Study of the Scale Invariant Signal to Distortion Ratio in Speech Separation with Noisy References

Title: Investigating the Scale-Invariant Signal-to-Distortion Ratio in Speech Separation Tasks Involving Noisy References

Abstract: This study examines the consequences of using the Scale-Invariant Signal-to-Distortion Ratio (SI-SDR) as both a training objective and an evaluation metric for supervised speech separation when the reference signals used during training are corrupted by noise, a common scenario in the widely adopted WSJ0-2Mix benchmark. A mathematical analysis of SI-SDR with noisy references indicates that the noise either caps the maximum achievable SI-SDR or causes noise to be reproduced in the separated outputs. To mitigate these issues, the authors propose a strategy that enhances the reference signals and incorporates mixture data from the WHAM! dataset, with the goal of preventing models from learning the noisy references. Two models trained on the enhanced datasets were evaluated with the non-intrusive NISQA.v2 metric. The results show reduced noise in the separated speech, but they also show that processing the reference signals can introduce artifacts, which limits the gain in overall audio quality. In addition, a negative correlation between SI-SDR and perceived noisiness was observed across models tested on both WSJ0-2Mix and Libri2Mix, which is consistent with the analysis.
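The two effects named in the abstract can be illustrated with a small numerical sketch. It is not the authors' derivation: it uses the standard SI-SDR definition (project the estimate onto the reference, then compare the energy of the projection with the energy of the residual), and it assumes a synthetic sine wave as "speech", Gaussian noise made exactly orthogonal to it, and a reference SNR of 20 dB. All names (`si_sdr`, `clean`, `noisy_reference`, `leaky_estimate`) are hypothetical. Under these assumptions a perfectly clean estimate scores exactly the SNR of the reference, while an estimate that keeps half of the reference noise scores higher.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def si_sdr(estimate, reference):
    """SI-SDR in dB of `estimate` measured against `reference`."""
    # Optimal scaling of the reference toward the estimate.
    alpha = dot(estimate, reference) / dot(reference, reference)
    target = [alpha * r for r in reference]
    error = [e - t for e, t in zip(estimate, target)]
    return 10.0 * math.log10(dot(target, target) / dot(error, error))


rng = random.Random(0)
n_samples = 4000

# Assumption: a 220 Hz sine at 8 kHz stands in for clean speech.
clean = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(n_samples)]

# Assumption: Gaussian noise, made orthogonal to the clean signal.
noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
proj = dot(noise, clean) / dot(clean, clean)
noise = [n - proj * c for n, c in zip(noise, clean)]

# Scale the noise so the reference has a 20 dB SNR (assumed value).
ref_snr_db = 20.0
gain = math.sqrt(dot(clean, clean) / (dot(noise, noise) * 10 ** (ref_snr_db / 10)))
noise = [gain * n for n in noise]
noisy_reference = [c + n for c, n in zip(clean, noise)]

# A perfectly clean estimate is capped at the reference SNR.
score_clean = si_sdr(clean, noisy_reference)

# Rescaling the estimate does not change the score (scale invariance).
score_scaled = si_sdr([0.5 * c for c in clean], noisy_reference)

# An estimate that keeps half of the reference noise scores higher.
leaky_estimate = [c + 0.5 * n for c, n in zip(clean, noise)]
score_leaky = si_sdr(leaky_estimate, noisy_reference)

print(f"clean estimate:  {score_clean:.2f} dB")
print(f"scaled estimate: {score_scaled:.2f} dB")
print(f"leaky estimate:  {score_leaky:.2f} dB")
```

In this toy setting, a model trained to maximize SI-SDR against the noisy reference is rewarded for reproducing the noise, which matches the trade-off the abstract describes.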


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
