arXiv

RAIGen: Rare Attribute Identification in Text-to-Image Generative Models

Title: RAIGen: Uncovering Rare Attributes in Text-to-Image Generative Models

Abstract:

While text-to-image diffusion models have demonstrated remarkable generation capabilities, they often perpetuate and exacerbate biases present in their training data, resulting in a skewed representation of semantic attributes. Existing research typically tackles this issue through one of two lenses: closed-set methods, which address biases within predefined fairness categories (such as race or gender) based on known socially significant minority attributes; or open-set methods, which treat the problem as bias identification by focusing on the majority attributes that overwhelmingly dominate model outputs. However, both approaches neglect a crucial complementary objective: the discovery of rare or minority features—whether social, cultural, or stylistic—that are underrepresented in the data distribution but still captured within model representations.

To address this gap, we present RAIGen, to our knowledge the first framework for label-free rare-attribute discovery in diffusion models; it requires no predefined minority categories. RAIGen combines Matryoshka Sparse Autoencoders with a novel minority-identification metric that integrates neuron activation frequency with semantic distinctiveness. This combination pinpoints interpretable neurons whose top-activating images expose underrepresented attributes. Our experiments indicate that RAIGen identifies attributes in Stable Diffusion that fall outside standard fairness categories. The framework also scales to larger architectures such as SDXL, supports systematic auditing across models, and enables targeted amplification of rare attributes during generation. More details can be found at https://vssilpa.github.io/RAIGen_webpage/.
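
As a rough illustration of how a score might combine activation frequency with semantic distinctiveness, the following is a minimal sketch and not the paper's metric, whose exact form the abstract does not give. The function `rare_attribute_scores`, the product `(1 - frequency) * distinctiveness`, the `threshold` and `top_k` parameters, and the toy data are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rare_attribute_scores(activations, embeddings, threshold=0.0, top_k=2):
    """Hypothetical score: neurons that fire rarely AND whose top-activating
    images look unlike the average image receive a high score.

    activations[i][j]: activation of sparse-autoencoder neuron j on image i.
    embeddings[i]:     semantic embedding of image i (any fixed-length vector).
    """
    n_images = len(activations)
    n_neurons = len(activations[0])
    global_mean = mean_vector(embeddings)
    scores = []
    for j in range(n_neurons):
        column = [activations[i][j] for i in range(n_images)]
        active = [i for i, a in enumerate(column) if a > threshold]
        if not active:
            scores.append(0.0)  # a neuron that never fires carries no attribute
            continue
        # Fraction of images on which the neuron fires.
        frequency = len(active) / n_images
        # Images that trigger the highest activations for this neuron.
        top = sorted(active, key=lambda i: column[i], reverse=True)[:top_k]
        top_mean = mean_vector([embeddings[i] for i in top])
        # Assumed proxy for distinctiveness: distance from the global mean embedding.
        distinctiveness = 1.0 - cosine(top_mean, global_mean)
        scores.append((1.0 - frequency) * distinctiveness)
    return scores


# Toy data: neuron 0 fires on every image, neuron 1 only on the one unusual image.
activations = [[1.0, 0.0], [0.9, 0.0], [0.8, 0.0], [0.7, 2.0]]
embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = rare_attribute_scores(activations, embeddings)
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranked, scores)
```

In this toy setup the always-active neuron scores zero, while the neuron that fires only on the single atypical image ranks first. In a real pipeline the activations would come from a trained sparse autoencoder and the embeddings from an image encoder, neither of which is modeled here.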


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
