arXiv

Multilingual Long-Form Speech Instruction Following: KIT's Submission to IWSLT 2026

Title: KIT’s Approach to Multilingual Long-Form Speech Instruction Following for IWSLT 2026

Abstract: The emergence of Large Language Models has shifted the paradigm from single-task and token-based multi-task architectures to instruction-driven systems. These newer models implicitly deduce the target language and specific task requirements directly from natural language prompts. This evolution is evident in the IWSLT Instruction Following Track, which this year expanded its scope by introducing novel challenges, including an unforeseen "surprise" task designed to test robustness and prevent overfitting to previously seen examples. This paper details KIT’s entry into the unconstrained Long and Short Instruction Following tracks. Our methodology employs a comprehensive data augmentation strategy that transforms short-form corpora into long-form training sets. This process involves concatenating segments, generating labels via LLMs, and applying cross-lingual translation, ultimately producing a dataset exceeding one million instances spanning four languages and six distinct tasks. Additionally, we demonstrate that while likelihood-based re-ranking is highly successful for Automatic Speech Recognition (ASR), it systematically harms performance on semantic tasks. This degradation occurs because the model spuriously favors candidates derived from segmented audio processing rather than holistic long-form inference. We resolve this issue by integrating likelihood scores with Minimum Bayes Risk decoding.
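The abstract does not say how likelihood scores and Minimum Bayes Risk decoding are combined, so the following is only a minimal sketch of one plausible reading: each candidate's MBR expected utility is interpolated with its min-max-normalised log-likelihood. The function names (`token_f1`, `mbr_with_likelihood`), the interpolation weight `alpha`, and the token-overlap F1 used as the utility are all assumptions made for illustration; they are not taken from the paper, which may use a different utility metric and a different way of combining the scores.

```python
from collections import Counter
from typing import List


def token_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1 between two strings (a stand-in utility, assumed here)."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return float(h == r)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_with_likelihood(
    candidates: List[str], logprobs: List[float], alpha: float = 0.5
) -> str:
    """Pick the candidate maximising alpha * MBR utility + (1 - alpha) * likelihood.

    `alpha` is a hypothetical interpolation weight: 1.0 gives pure MBR,
    0.0 gives pure likelihood re-ranking.
    """
    n = len(candidates)
    if n == 0 or n != len(logprobs):
        raise ValueError("need one log-probability per candidate")
    if n == 1:
        return candidates[0]

    # MBR expected utility: average similarity to every other candidate,
    # treating the candidate pool itself as the pseudo-reference set.
    utilities = [
        sum(token_f1(c, o) for j, o in enumerate(candidates) if j != i) / (n - 1)
        for i, c in enumerate(candidates)
    ]

    # Min-max normalise log-probabilities so both terms lie in [0, 1].
    lo, hi = min(logprobs), max(logprobs)
    span = hi - lo
    norm_lp = [(lp - lo) / span if span > 0 else 0.0 for lp in logprobs]

    scores = [alpha * u + (1 - alpha) * l for u, l in zip(utilities, norm_lp)]
    best = max(range(n), key=scores.__getitem__)
    return candidates[best]
```

With `alpha = 0.0` the function reduces to plain likelihood re-ranking, the setting the abstract reports as harmful for semantic tasks. Raising `alpha` shifts the choice toward the candidate that agrees most with the rest of the pool, which is the consensus behaviour MBR is meant to provide.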


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
