arXiv

You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models

Original (arXiv:2603.00133v2): Abstract: Generative models have been shown to "memorize" certain training data, leading to the generation of verbatim or near-verbatim copies of training images, which may cause privacy concerns or copyright infringement. We introduce Guidance Using Attractive-Repulsive Dynamics (GUARD), a novel framework for memorization mitigation in text-to-image diffusion models. GUARD adjusts the image denoising process to guide the generation away from an original training image and towards one that is distinct from training data while remaining aligned with the prompt, guarding against reproducing training data without hurting image generation quality. We propose a concrete instantiation of this framework, where the positive target that we steer towards is given by a novel method for cross-attention attenuation based on (i) a novel statistical mechanism that automatically identifies the prompt positions where cross-attention must be attenuated and (ii) attenuation of cross-attention at these per-prompt locations. The resulting GUARD offers a surgical, dynamic, per-prompt, inference-time approach that, we find, is by far the most robust method in terms of consistently producing state-of-the-art results for memorization mitigation across two architectures and for both verbatim and template memorization, while also improving upon or yielding comparable results in terms of image quality.

Rewritten:

Abstract: The tendency of generative models to "memorize" specific segments of their training sets can result in the production of images that are exact or nearly exact copies of source material, raising significant issues regarding copyright violations and privacy breaches. To address this, we present Guidance Using Attractive-Repulsive Dynamics (GUARD), a new framework designed to mitigate memorization within text-to-image diffusion models. GUARD modifies the denoising phase of image generation to steer outputs away from existing training images toward novel, distinct visuals that still adhere to the user's prompt. This process prevents the replication of training data while maintaining high image generation quality. We provide a specific implementation of this framework, utilizing a new technique for attenuating cross-attention to define the positive target for generation. This technique relies on (i) a statistical method that automatically detects which prompt positions require cross-attention reduction, and (ii) the actual attenuation of cross-attention at these identified locations. Our findings indicate that GUARD represents a highly robust, dynamic, per-prompt approach applied at inference time. It consistently achieves state-of-the-art performance in mitigating both template and verbatim memorization across two different model architectures. Furthermore, GUARD either enhances or matches the quality of the generated images compared to existing methods.
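
The abstract gives no formulas, so the following is only an illustrative sketch of the two ideas it names, not the authors' method. It assumes (a) a simple z-score outlier test as a stand-in for the unspecified statistical mechanism, (b) attenuation implemented as scaling the flagged token columns of a row-normalized cross-attention map and renormalizing, and (c) a guidance rule that adds an attractive term toward a positive noise prediction and subtracts a repulsive term toward a negative one. All function names, parameters, and default values here are hypothetical.

```python
import numpy as np


def select_positions(attn_mass, z_thresh=1.5):
    """Hypothetical selector: flag prompt positions whose attention mass
    is an upper outlier (z-score above z_thresh). This is a stand-in;
    the abstract does not specify the statistical test."""
    std = attn_mass.std()
    if std == 0:
        return np.zeros(attn_mass.shape, dtype=bool)
    return (attn_mass - attn_mass.mean()) / std > z_thresh


def attenuate_cross_attention(attn, positions, scale=0.1):
    """Scale down the flagged token columns of a (queries, tokens)
    attention map, then renormalize each row to sum to 1.
    Assumes scale > 0 so that no row sums to zero."""
    out = attn.copy()
    out[:, positions] *= scale
    return out / out.sum(axis=-1, keepdims=True)


def attractive_repulsive_guidance(eps_uncond, eps_pos, eps_neg,
                                  w_attract=7.5, w_repel=2.0):
    """Assumed guidance rule: move toward the positive target and away
    from the negative one, both measured from the unconditional
    prediction."""
    return (eps_uncond
            + w_attract * (eps_pos - eps_uncond)
            - w_repel * (eps_neg - eps_uncond))


# Toy example: 2 image queries attending over 4 prompt tokens.
attn = np.array([[0.7, 0.1, 0.1, 0.1],
                 [0.4, 0.2, 0.2, 0.2]])
flagged = select_positions(attn.mean(axis=0))
attenuated = attenuate_cross_attention(attn, flagged)
```

In this toy case only the first token is flagged, and its share of attention drops in both rows while each row still sums to 1. In a real pipeline the positive prediction would presumably come from a denoiser pass that uses the attenuated attention, and the negative prediction from the unmodified prompt; that wiring is an assumption and is not shown here.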


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
