Efficient and Training-Free Single-Image Diffusion Models
Title: A Training-Free, High-Efficiency Approach to Single-Image Diffusion Models
Abstract:
This study addresses the challenge of synthesizing images that replicate the internal structural characteristics of a single reference image, specifically focusing on how patch distributions are organized across various scales. While contemporary methods tackle this issue by training diffusion models on individual images, such processes are notoriously resource-intensive, often demanding hours of computational optimization. In contrast, our methodology constructs a finite dataset comprising patches from the reference image at multiple resolutions. Given the limited size of this dataset and the low dimensionality of the patches, we can efficiently calculate the score function for noisy patches by employing an optimal, closed-form denoiser. This technique removes the necessity for neural network training entirely.
We embed this patch-based denoising mechanism into a streamlined, training-free image diffusion framework and explore its theoretical links to traditional patch-based image restoration methods. Our results indicate that this approach surpasses trained single-image diffusion models in both diversity and generation quality. We illustrate its versatility through several applications, including unconditional image synthesis, text-driven stylization, image symmetrization, and retargeting. Furthermore, we demonstrate that the method integrates seamlessly with latent space diffusion. By leveraging multiple acceleration strategies, our system is capable of generating megapixel-resolution images in just one second and gigapixel images within minutes.
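The closed-form denoiser mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a single scale, a grayscale image, a uniform prior over the extracted patches, and additive Gaussian noise of the form y = x + sigma * n. Under those assumptions the optimal (minimum mean squared error) denoiser is a softmax-weighted average of the clean patches, and the score follows from Tweedie's formula as (D(y) - y) / sigma^2. The function names `extract_patches`, `optimal_denoise`, and `score` are hypothetical and chosen only for this example.

```python
import numpy as np


def extract_patches(image, size):
    """Collect every size x size patch of a 2-D array as a flattened row."""
    h, w = image.shape
    rows = [
        image[i:i + size, j:j + size].ravel()
        for i in range(h - size + 1)
        for j in range(w - size + 1)
    ]
    return np.stack(rows)


def optimal_denoise(noisy, patches, sigma):
    """Posterior mean E[x | y], assuming x is uniform over `patches`
    and y = x + sigma * n with standard Gaussian n (an assumption of this sketch)."""
    # Squared distance from each noisy query (rows of `noisy`) to each clean patch.
    d2 = ((noisy[:, None, :] - patches[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Each output row is a convex combination of the clean patches.
    return weights @ patches


def score(noisy, patches, sigma):
    """Score of the noisy patch distribution via Tweedie's formula."""
    return (optimal_denoise(noisy, patches, sigma) - noisy) / sigma ** 2


# Toy usage on a random 16 x 16 "image" with 3 x 3 patches.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
patches = extract_patches(image, 3)
noisy = patches[:4] + 0.01 * rng.standard_normal((4, 9))
denoised = optimal_denoise(noisy, patches, sigma=0.01)
```

At small noise levels the weights concentrate on the nearest clean patch, while at large noise levels the output approaches the mean of all patches. The brute-force distance computation costs O(queries x patches x patch dimension), which is only practical because the patch set is finite and the patches are low-dimensional. The acceleration strategies referred to in the abstract are not reproduced here.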
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC



