arXiv

The Alignment Curse: Modality Alignment Supercharges Audio Attacks via Text Transfer

Original: arXiv:2602.02557v2. Abstract: Recent advances in end-to-end trained omni-models have substantially improved audio capabilities by strengthening text-audio modality alignment. However, whether such alignment inadvertently facilitates the transfer of safety vulnerabilities across modalities remains underexplored. This question is critical as text-based jailbreak attacks are considerably more mature than audio-based ones; if they transfer systematically, current audio safety evaluations may underestimate risks originating from the text modality. In this paper, we introduce the Alignment Curse, a formally characterized and empirically validated principle showing that stronger modality alignment enables more effective transfer of attacks from text to audio, revealing a fundamental tension between capability and safety. Motivated by this principle, we conduct a comprehensive black-box evaluation of three attack categories on recent omni-models (e.g., Qwen2.5-Omni, Qwen3-Omni): text attacks, text-transferred audio attacks, and audio attacks. We find that text-transferred audio attacks perform comparably to, and often better than, audio-based attacks, exhibiting a clear advantage under audio-only access. This suggests that text-based vulnerabilities play a pivotal role in shaping audio safety risks. Finally, we empirically analyze the relationship between modality alignment and transfer effectiveness across attack methods and models, observing consistent support for the Alignment Curse: tighter modality alignment leads to more effective cross-modality attack transfer.

Rewrite: Recent gains in the audio capabilities of end-to-end trained omni-models have been driven by stronger text-audio modality alignment. Whether this alignment also lets safety vulnerabilities transfer across modalities remains underexplored. The question matters because text-based jailbreak attacks are considerably more mature than audio-based ones: if they transfer systematically, current audio safety evaluations may underestimate risks that originate in the text modality. We introduce the "Alignment Curse," a formally characterized and empirically validated principle stating that stronger modality alignment enables more effective transfer of attacks from text to audio, which reveals a fundamental tension between capability and safety. Motivated by this principle, we conduct a comprehensive black-box evaluation of recent omni-models (e.g., Qwen2.5-Omni, Qwen3-Omni) across three attack categories: text attacks, text-transferred audio attacks, and audio attacks. Text-transferred audio attacks perform comparably to, and often better than, audio-based attacks, with a clear advantage under audio-only access. This suggests that text-based vulnerabilities play a pivotal role in shaping audio safety risks. Finally, our empirical analysis of modality alignment and transfer effectiveness across attack methods and models consistently supports the Alignment Curse: tighter modality alignment leads to more effective cross-modality attack transfer.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
