Geometry-Preserving Encoder/Decoder in Latent Generative Models
Title: A Geometry-Preserving Encoder/Decoder for Latent Generative Models
Abstract: Generative modeling aims to synthesize new data that match the characteristics of a target dataset. Diffusion models are widely used for this purpose, but the high dimensionality of the input space remains a significant obstacle. Contemporary methods therefore run diffusion in a latent space, using an encoder to map data from the original domain to a lower-dimensional representation. This strategy improves computational efficiency and has yielded state-of-the-art performance. The variational autoencoder (VAE) is currently the predominant encoder/decoder architecture in this setting, valued for its ability to learn latent representations and generate data. This study presents a novel encoder/decoder framework whose theoretical properties differ fundamentally from those of the VAE and which is specifically designed to preserve the geometric structure of the data distribution. We highlight the substantial benefits of this geometry-preserving encoder for training both the encoder and the decoder. We also provide theoretical results on the convergence of training: convergence guarantees for encoder training, and results showing that decoder training converges faster when the geometry-preserving encoder is used.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
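
The abstract's notion of an encoder that preserves the geometric structure of the data can be made concrete with a toy measurement. The sketch below is illustrative only and is not the paper's training objective, which the abstract does not specify. It assumes that "geometry" means pairwise Euclidean distances, and the names `pairwise_distances`, `distortion`, `keep_plane`, and `collapse` are hypothetical helpers introduced here.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distances over all unordered pairs, in a fixed order."""
    n = len(points)
    return [
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def distortion(data, encode):
    """Mean squared gap between input-space and latent-space distances.

    Assumption: geometry is summarized by pairwise Euclidean distances.
    A value near zero means the encoder is close to an isometry on `data`.
    """
    d_in = pairwise_distances(data)
    d_lat = pairwise_distances([encode(x) for x in data])
    return sum((a - b) ** 2 for a, b in zip(d_in, d_lat)) / len(d_in)


# Toy data: 20 points on a 2-D plane embedded in 4-D space.
random.seed(0)
data = [(random.random(), random.random(), 0.0, 0.0) for _ in range(20)]


def keep_plane(x):
    """Hypothetical encoder that keeps both informative coordinates."""
    return (x[0], x[1])


def collapse(x):
    """Hypothetical encoder that discards the second coordinate."""
    return (x[0], 0.0)


print("keep_plane distortion:", distortion(data, keep_plane))
print("collapse distortion:  ", distortion(data, collapse))
```

On this toy data, `keep_plane` leaves every pairwise distance unchanged, while `collapse` shrinks distances along the discarded axis, so its distortion is strictly larger. A distortion score of this kind is one simple way to compare encoders. Whether it resembles the criterion used in the paper is not something the abstract settles.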




