arXiv

Beyond Objects: Contextual Synthetic Data Generation for Fine-Grained Classification

While text-to-image (T2I) models are becoming standard tools for creating synthetic datasets, producing high-quality training data for classification tasks remains a significant hurdle. Although fine-tuning a T2I model using a limited set of real-world examples can enhance output quality, this approach often leads to overfitting and a consequent loss of diversity in the generated samples. To address these challenges in the context of fine-grained classification, we introduce BOB (BeyondOBjects), a novel fine-tuning strategy.

BOB first extracts class-agnostic attributes, such as object pose and scene background, from a small collection of real examples. While the T2I model is fine-tuned, these attributes serve as explicit conditions; at generation time they are marginalized out. This design helps prevent overfitting, preserves the model’s generative prior, lowers estimation error, and reduces spurious correlations between classes.
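One way to picture the condition-then-marginalize idea is at the prompt level. The sketch below is an illustration only, not the authors' implementation. The prompt template, the `Caption` type, both function names, and the example attributes are all hypothetical. It assumes the class-agnostic attributes are available as short text snippets and that "marginalizing out" can be approximated by drawing each (pose, background) pair from the pool of all real examples, independent of the target class.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Caption:
    """Hypothetical record: a class label plus class-agnostic context."""
    class_name: str
    pose: str
    background: str


def training_prompt(c: Caption) -> str:
    # During fine-tuning, context is stated explicitly in the condition.
    # The template itself is an assumption made for this sketch.
    return f"photo of {c.class_name}, {c.pose}, {c.background}"


def sample_generation_prompts(class_name, context_pool, n, rng):
    # At generation time, approximate marginalization by sampling the
    # context from the pooled contexts of ALL classes, so no single
    # pose or background stays tied to one class.
    prompts = []
    for _ in range(n):
        pose, background = rng.choice(context_pool)
        prompts.append(training_prompt(Caption(class_name, pose, background)))
    return prompts


# Made-up examples standing in for attributes extracted from real images.
real_examples = [
    Caption("Boeing 747", "taking off", "on a runway at dusk"),
    Caption("Airbus A320", "parked, side view", "at an airport gate"),
    Caption("Cessna 172", "in flight, seen from below", "against a clear sky"),
]
context_pool = [(c.pose, c.background) for c in real_examples]

prompts = sample_generation_prompts(
    "Boeing 747", context_pool, n=4, rng=random.Random(0)
)
for p in prompts:
    print(p)
```

Under these assumptions, a class can be paired with contexts that were only ever observed for other classes, which is one plausible route to the diversity and reduced spurious correlation described above.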

Our extensive evaluation across various T2I architectures, backbone networks, and datasets shows that BOB delivers state-of-the-art results for low-shot fine-grained classification when real data is augmented with synthetic data. On the Aircraft dataset, BOB improved accuracy by 7.4 percentage points over the DataDream method, from 50.0% to 57.4%, when a CLIP classifier was fine-tuned on five real images alongside 100 synthetic ones. Furthermore, on three of the four benchmarks tested, fine-tuning downstream models with five real images augmented by BOB outperformed fine-tuning with ten real images. Overall, BOB surpassed previous methods in 18 of 24 experimental configurations, with accuracy gains of more than 2% in 14 of those cases.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
