arXiv

Fine-Tuning Diffusion Models for Molecular Generation via Reinforcement Learning and Fast Sampling

June 2, 2026 · Guang Lin, Shikui Tu, Lei Xu

Title: Accelerating Molecular Generation with Reinforcement Learning and Rapid Sampling in Fine-Tuned Diffusion Models

Structure-based drug design (SBDD) faces a significant hurdle: generating molecules that both have desirable drug-like properties and fit the three-dimensional structure of a target protein. Existing generative methods struggle with this task. They typically require expensive post-processing during sampling or depend on carefully curated training data, and even then deliver only marginal improvements. The problem is most acute in multi-objective settings, where competing requirements must be reconciled.

To overcome these obstacles, we introduce FTDiff, a reinforcement learning-based fine-tuning framework for diffusion models that generate molecules under structural constraints. For stable and sample-efficient optimization, FTDiff adopts Group Relative Policy Optimization (GRPO). The framework also pairs a time-free pretrained diffusion model with a fast sampling mechanism, which sharply reduces the number of denoising steps and so speeds up both training and inference without compromising the quality of the generated molecules.
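As a rough illustration of the group-relative idea behind GRPO, the sketch below standardizes the rewards of a group of samples against that group's own mean and standard deviation, so no learned value baseline is needed. It is a minimal sketch of the commonly used GRPO advantage, not FTDiff's actual implementation; the function name `group_relative_advantages` and the `eps` stabilizer are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples.

    Each advantage is (reward - group mean) / (group std + eps), so samples
    are scored relative to their siblings rather than to a learned critic.
    `eps` (an assumed stabilizer) avoids division by zero when every sample
    in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four molecules sampled for the same protein pocket.
advantages = group_relative_advantages([0.2, 0.5, 0.5, 0.9])
```

Samples that score above their group's average receive a positive advantage and are reinforced, while below-average samples are discouraged. A group with identical rewards yields all-zero advantages and therefore contributes no gradient signal.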

By optimizing a reward function based on fixed thresholds, FTDiff steers the model toward valid, diverse, and high-quality molecules while balancing the competing objectives of drug design. Experiments on benchmark datasets show that FTDiff consistently outperforms prior methods, and it does so without costly post-hoc optimization or complex data engineering.
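One way to picture a fixed-threshold, multi-objective reward is sketched below. The abstract does not specify the objectives, the cutoffs, or how they are combined, so everything here is hypothetical: the property names (`qed`, `sa`, `vina`), the cutoff values, and the choice to score a molecule by the fraction of thresholds it meets are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    name: str                      # key in the property dictionary
    cutoff: float                  # fixed cutoff for this objective
    higher_is_better: bool = True  # False for scores such as docking energy


# Hypothetical objectives and cutoffs, chosen only for illustration.
THRESHOLDS = [
    Threshold("qed", 0.5),
    Threshold("sa", 0.6),
    Threshold("vina", -8.0, higher_is_better=False),
]


def threshold_reward(props, thresholds=THRESHOLDS):
    """Return the fraction of objectives whose fixed cutoff is met.

    `props` maps property names to values, or is None for an invalid
    molecule, which receives a reward of 0.0 in this sketch.
    """
    if props is None:
        return 0.0
    met = 0
    for t in thresholds:
        value = props[t.name]
        ok = value >= t.cutoff if t.higher_is_better else value <= t.cutoff
        met += int(ok)
    return met / len(thresholds)


# A molecule meeting all three hypothetical cutoffs gets the maximum reward.
reward = threshold_reward({"qed": 0.7, "sa": 0.7, "vina": -9.0})
```

Because each objective only needs to clear its cutoff, a reward of this shape does not keep pushing one property at the expense of the others once its threshold is satisfied, which is one plausible reason to prefer fixed thresholds in a multi-objective setting.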
