Generate in Reconstruction Space, Match in Semantic Space: Transport Geometry for One-Step Generation
Title: Generating in Reconstruction Space, Matching in Semantic Space: Transport Geometry for One-Step Generation
Abstract:
Generative modeling and self-supervised representation learning (SSL) pursue fundamentally distinct optimization goals: the former prioritizes distributional fidelity, whereas the latter emphasizes semantic coherence. Although recent studies have consistently demonstrated that SSL features enhance generative training, the underlying mechanism driving this synergy remains elusive. This study investigates the utility of SSL within the context of one-step generation, a framework that explicitly leverages representations by employing frozen SSL features to align generated samples with real data. We utilize Sinkhorn divergence within this feature space as a computationally efficient surrogate for the Wasserstein distance, which serves as the population-level discrepancy metric approximated by Fréchet-style evaluations (e.g., FID). Our findings indicate that this objective achieves superior performance when applied to semantically structured SSL feature spaces, yielding a 39-fold reduction in ImageNet FID. We attribute this improvement primarily to the dynamics of matching estimation: by filtering out nuisance reconstruction details, semantic SSL features create a more compact geometry, thereby rendering distribution matching more tractable. Consequently, the optimal SSL features for training do not necessarily align with those used in evaluation metrics. For instance, we demonstrate that employing Inception as a feature extractor can artificially lower FID scores while simultaneously compromising matching stability and sample quality, a phenomenon indicative of metric hacking. Through extensive experiments on ImageNet, we delineate which SSL feature families yield the highest generation performance and propose matching stability as a quantitative metric for their selection. Code is available at https://github.com/Genentech/semantic-transport-generation.
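To make the matching objective concrete, here is a minimal NumPy sketch of a debiased Sinkhorn divergence between two batches of feature vectors. It is an illustration under stated assumptions, not the authors' implementation: the function names, the squared Euclidean cost, the uniform sample weights, and the values of `eps` and `n_iters` are all hypothetical choices, and random arrays stand in for the outputs of a frozen SSL encoder.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def entropic_ot(x, y, eps=1.0, n_iters=200):
    """Dual estimate of the entropic OT cost between two uniform empirical measures.

    x: (n, d) array, y: (m, d) array. The cost is squared Euclidean distance
    (an assumption made for this sketch).
    """
    n, m = x.shape[0], y.shape[0]
    cost = np.sum(x**2, axis=1)[:, None] + np.sum(y**2, axis=1)[None, :] - 2.0 * x @ y.T
    cost = np.maximum(cost, 0.0)  # clip tiny negatives caused by round-off
    log_a = np.full(n, -np.log(n))  # uniform weights on x
    log_b = np.full(m, -np.log(m))  # uniform weights on y
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Log-domain Sinkhorn updates of the two dual potentials.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    # After the g-update the plan has total mass 1, so the dual value is <a, f> + <b, g>.
    return float(np.mean(f) + np.mean(g))


def sinkhorn_divergence(x, y, eps=1.0, n_iters=200):
    """Debiased divergence: OT_eps(x, y) - 0.5 * OT_eps(x, x) - 0.5 * OT_eps(y, y)."""
    return (
        entropic_ot(x, y, eps, n_iters)
        - 0.5 * entropic_ot(x, x, eps, n_iters)
        - 0.5 * entropic_ot(y, y, eps, n_iters)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real_feats = rng.normal(size=(64, 8))  # stand-in for frozen-encoder features of real images
    fake_feats = real_feats + 3.0          # stand-in for features of generated images
    print(sinkhorn_divergence(real_feats, real_feats))
    print(sinkhorn_divergence(real_feats, fake_feats))
```

In a training loop, the same quantity would be computed with an autodiff framework so that gradients flow through the frozen encoder back to the one-step generator. The choice of encoder changes only the feature arrays passed in, which is where the abstract locates the difference between semantic and reconstruction-oriented spaces.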
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





