arXiv

AUDDT: A Unified Benchmark Toolkit for Audio and Speech Deepfake Detectors

June 4, 2026 · Yi Zhu, Heitor R. Guimarães, Arthur Pimentel, Tiago Falk

Abstract:

As AI-generated media, particularly audio deepfakes, becomes increasingly common, significant research effort has been directed toward creating effective detection mechanisms. Despite this progress, current benchmarking resources rely on limited datasets, casting doubt on how well these detectors perform in real-world environments. To address this gap, this study provides a systematic review of 31 existing audio deepfake datasets and introduces AUDDT, an open-source benchmarking toolkit (available at https://github.com/MuSAELab/AUDDT).

AUDDT is designed to streamline the assessment of pretrained detectors by testing them against a broad spectrum of speech and non-speech audio collections. It offers users immediate insights into the strengths and weaknesses of their models across various manipulation techniques and recording scenarios. The paper first demonstrates the toolkit’s functionality, then outlines the benchmark’s structure and its categorization of deepfake subgroups.

A key distinction of AUDDT compared to prior efforts is its capacity for large-scale, heterogeneous evaluation of contemporary spoofing methods, supported by detailed attribute-level analysis via extensive metadata annotation. Using a widely used pretrained detector, the study reports both in-domain and out-of-domain detection metrics, which expose significant performance fluctuations depending on the recording conditions and the type of audio manipulation. Finally, the study examines the constraints of current datasets and identifies discrepancies between existing resources and the requirements for practical deployment.
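To make the idea of attribute-level analysis concrete, the following is a minimal sketch of grouping detector scores by a metadata attribute and reporting accuracy per group. It does not use the AUDDT API: the function name `accuracy_by_attribute`, the record keys (`score`, `label`, `domain`, `attack`), the 0.5 threshold, and the toy records are all assumptions made for illustration, and the numbers are not results from the paper.

```python
from collections import defaultdict


def accuracy_by_attribute(records, attribute, threshold=0.5):
    """Group scored clips by one metadata attribute and return accuracy per group.

    Each record is a dict with hypothetical keys:
      "score": detector output, where higher means more likely fake
      "label": 1 for fake, 0 for real
    plus arbitrary metadata fields such as "domain" or "attack".
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[attribute]
        # Binarize the score at the (assumed) decision threshold.
        predicted = 1 if rec["score"] >= threshold else 0
        correct[group] += int(predicted == rec["label"])
        total[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy, made-up records purely for illustration.
records = [
    {"score": 0.9, "label": 1, "domain": "in", "attack": "tts"},
    {"score": 0.1, "label": 0, "domain": "in", "attack": "none"},
    {"score": 0.4, "label": 1, "domain": "out", "attack": "vc"},
    {"score": 0.2, "label": 0, "domain": "out", "attack": "none"},
]

by_domain = accuracy_by_attribute(records, "domain")
by_attack = accuracy_by_attribute(records, "attack")
```

Breaking a single aggregate score into per-attribute groups in this way is what lets a gap between in-domain and out-of-domain performance, or between manipulation types, become visible. A real evaluation would more likely report a threshold-free metric such as equal error rate rather than accuracy at a fixed threshold.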
