arXiv

Off-Distribution Voices: Fanfiction Subgenres as Universal Vernacular Jailbreaks for Aligned LLMs

Abstract: Current jailbreak techniques against aligned large language models (LLMs) rely on discrete artifacts that are easy to fingerprint and patch. We contend that the fundamental vulnerability lies not in individual prompts but in a register of natural human writing that safety training covers insufficiently. Building on this insight, we present the first jailbreak family that uses authentic fanfiction subgenres as universal attack vectors. The method employs a creative-writing meta-framework conditioned on excerpts from twelve distinct subgenres on the Archive of Our Own (AO3) and embeds the harmful behavior as the narrative climax of the generated scene. It requires neither an attacker LLM nor per-target adaptation. Evaluated on eight aligned LLMs using the combined HarmBench and JailbreakBench datasets, the attack raises the mean Attack Success Rate (ASR) from 0.278 to 0.731, as measured by a four-judge ensemble. A factorial decomposition shows that the gain is driven by linguistic register rather than text length or structural complexity. Furthermore, two active defense mechanisms widen, rather than reduce, the gap between vernacular and baseline ASRs, suggesting that defenses targeting specific templates inadvertently push attackers toward register-based strategies such as the one proposed here. Finally, we introduce SAGA-A4, a static four-turn extension that achieves a mean ASR of 0.924, significantly outperforming three existing multi-turn methods.
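
For readers unfamiliar with the metric, the sketch below shows one way an ASR could be scored from per-response judge verdicts and how the reported gain reads as arithmetic. The abstract does not say how the four judges are combined, so the strict-majority rule here is an assumption, and the function names (`ensemble_verdict`, `attack_success_rate`, `absolute_gain`) are hypothetical, not taken from the paper.

```python
def ensemble_verdict(votes: list[bool]) -> bool:
    """Combine judge votes for one response.

    Assumption: strict majority, so a 2-2 tie among four judges
    counts as "not a successful attack". The paper may use another rule.
    """
    return sum(votes) * 2 > len(votes)


def attack_success_rate(judgments: list[list[bool]]) -> float:
    """Fraction of responses that the ensemble marks as successful attacks."""
    if not judgments:
        raise ValueError("need at least one judged response")
    successes = sum(ensemble_verdict(votes) for votes in judgments)
    return successes / len(judgments)


def absolute_gain(baseline_asr: float, attack_asr: float) -> float:
    """Absolute difference in ASR between attack and baseline."""
    return attack_asr - baseline_asr


# The mean ASRs reported in the abstract: 0.278 (baseline) and 0.731 (attack).
gain = absolute_gain(0.278, 0.731)   # about 0.453
ratio = 0.731 / 0.278                # a little over 2.6x
```

Under this reading, the reported figures correspond to an absolute gain of about 0.453 in mean ASR, a little over 2.6 times the baseline.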


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
