arXiv

TextFake: Benchmarking AI-Generated Image Detection on Text-Rich Images

Title: TextFake: Evaluating the Detection of AI-Generated Images in Text-Heavy Contexts

Abstract

While current detectors for AI-generated images (AIGI) perform strongly on standard natural-image datasets, their effectiveness against text-rich forgeries, such as the fake screenshots, documents, and news pages common in misinformation campaigns, has not been adequately assessed. To address this gap, we present TextFake, a benchmark of 20,000 images for text-rich AIGI detection. The dataset covers 28 languages, four topic categories, and two scene modalities.

Fake images are created with a four-stage pipeline. We first annotate real images along three controlled dimensions, then generate the corresponding fake versions using structured prompts that follow the same distribution over those dimensions. This matching removes covariate shortcuts, that is, incidental differences between real and fake images that a detector could exploit instead of genuine generative artifacts.
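The distribution-matching idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the three controlled dimensions, so the sketch assumes they are language, topic, and scene modality, and the `Annotation` class, `build_prompt` template, and helper names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    # Assumed dimensions; the abstract only says "three controlled dimensions".
    language: str
    topic: str
    scene: str


def build_prompt(a: Annotation) -> str:
    # Hypothetical structured prompt template built from one real image's annotation.
    return (
        f"A {a.scene} image about {a.topic}, "
        f"with all visible text written in {a.language}."
    )


def paired_prompts(real_annotations):
    # One fake prompt per real image, so fakes inherit the real distribution exactly.
    return [(a, build_prompt(a)) for a in real_annotations]


def cell_counts(annotations):
    # Count images per (language, topic, scene) cell to check for covariate imbalance.
    return Counter((a.language, a.topic, a.scene) for a in annotations)


reals = [
    Annotation("English", "news", "screenshot"),
    Annotation("Arabic", "finance", "document"),
    Annotation("English", "news", "screenshot"),
]
pairs = paired_prompts(reals)
fake_annotations = [a for a, _ in pairs]
```

Because each fake prompt is derived from exactly one real annotation, `cell_counts(reals)` and `cell_counts(fake_annotations)` are identical, so a detector cannot separate the classes by language, topic, or scene alone.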

In zero-shot evaluations of 14 specialized detectors and three state-of-the-art vision-language model (VLM) APIs, we observe a systematic performance gap. No model exceeds 80% accuracy, and several drop by more than 60% compared with their results on natural-image benchmarks. Our diagnostic analysis identifies three primary failure modes:

  1. The Text Density Curse: High concentrations of glyphs overwhelm detectors that rely on low-level features.
  2. Cloaking via Rendering Fidelity: High-quality text rendering masks generative artifacts, making them harder to detect.
  3. Threshold Collapse: Routine perturbations cause detectors to revert to chance-level performance.
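As a toy illustration of the third failure mode, the sketch below shows one way a fixed decision threshold can fall to chance accuracy. It rests on assumptions that are not in the abstract: the scores are made up, and the perturbation is modeled as a uniform downward shift of every detector score, which is only one possible mechanism.

```python
def accuracy(scores, labels, threshold=0.5):
    # Predict "fake" (1) when the detector score reaches the threshold.
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Balanced toy set: 0 = real, 1 = fake. Scores are invented for illustration.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
clean_scores = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90]

# Assumed effect of a routine perturbation: every score drops by the same amount.
shift = 0.45
perturbed_scores = [s - shift for s in clean_scores]

clean_acc = accuracy(clean_scores, labels)                      # threshold 0.5
collapsed_acc = accuracy(perturbed_scores, labels)              # same threshold
recalibrated_acc = accuracy(perturbed_scores, labels, threshold=0.05)
```

In this toy case the fixed threshold labels every perturbed image as real, so accuracy on the balanced set falls from 1.0 to 0.5, while a re-tuned threshold still separates the classes perfectly. Whether the detectors in the benchmark fail in this way or lose their ranking ability altogether is not stated in the abstract.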

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
