Learning Implicit Bias in Generative Spaces for Accelerating Protein Dynamics Emulation
Title: Accelerating Protein Dynamics Emulation by Learning Implicit Bias in Generative Spaces
Abstract:
While generative emulators offer a computationally efficient alternative to molecular dynamics by producing plausible protein trajectories, they are constrained by their training data. Consequently, during long-horizon extrapolation, these models typically recycle familiar states rather than exploring rare configurations. Drawing inspiration from classical enhanced sampling techniques, we propose introducing an implicit, history-dependent bias within the generative space of a pretrained emulator. This approach employs a history-aware score estimator that applies a distance-weighted bias to the frozen emulator, effectively guiding reverse-time sampling away from previously encountered structures. This process is further stabilized by an environment-support term. To ensure structural integrity over extended periods, we implement a score-based refinement step that reprojects drifted samples back onto the data manifold via the frozen emulator. Our experimental results indicate that this method increases diversity by 35% on the DynamicPDB-80 dataset. Furthermore, in tests involving 12 zero-shot Fast-Folding proteins, the learned bias alone achieves coverage comparable to the unbiased emulator approximately 15 times faster. When combined with refinement, the method accelerates coverage by roughly 37 times and identifies approximately three times as many low-energy states. The associated code will be made publicly available soon.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
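Illustrative sketch (not the authors' implementation): the abstract describes a history-dependent, distance-weighted bias added to a frozen emulator's score so that sampling is pushed away from previously visited structures. The toy NumPy example below shows that general idea under loud assumptions. The frozen emulator is replaced by the analytic score of a 2-D standard normal, the bias is a sum of Gaussian repulsion kernels in the spirit of metadynamics, and plain Langevin dynamics is swapped in for the reverse-time diffusion sampler. The names `base_score`, `history_bias`, `sample_trajectory`, `strength`, and `bandwidth` are hypothetical, and the learned score estimator, environment-support term, and refinement step from the abstract are not modeled.

```python
import numpy as np


def base_score(x):
    # Stand-in for the frozen emulator (assumption): score of a standard
    # normal, i.e. the gradient of log N(x; 0, I).
    return -x


def history_bias(x, history, strength=1.0, bandwidth=0.5):
    # Gradient of -strength * sum_i exp(-||x - h_i||^2 / (2 * bandwidth^2)).
    # Each visited state h_i contributes a repulsive push whose weight decays
    # with squared distance, so nearby history matters most.
    if len(history) == 0:
        return np.zeros_like(x)
    diffs = x - np.asarray(history)                    # shape (n, d)
    sq_dist = np.sum(diffs ** 2, axis=1)               # shape (n,)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return strength * (weights[:, None] * diffs).sum(axis=0) / bandwidth ** 2


def sample_trajectory(n_frames, biased, rng, step=0.05, n_steps=20, dim=2):
    # Langevin sampling (a simplification of reverse-time sampling): each
    # frame is produced by n_steps noisy score-ascent updates, then appended
    # to the history that shapes the bias for later frames.
    history = []
    x = np.zeros(dim)
    for _ in range(n_frames):
        for _ in range(n_steps):
            score = base_score(x)
            if biased:
                score = score + history_bias(x, history)
            noise = rng.standard_normal(dim)
            x = x + step * score + np.sqrt(2.0 * step) * noise
        history.append(x.copy())
    return np.array(history)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    unbiased = sample_trajectory(50, biased=False, rng=rng)
    biased = sample_trajectory(50, biased=True, rng=rng)
    # Compare how widely each trajectory spreads over the toy landscape.
    print("unbiased per-axis std:", unbiased.std(axis=0))
    print("biased per-axis std:  ", biased.std(axis=0))
```

In this toy setting the bias term only reshapes the score at sampling time and the base model is left untouched, which mirrors the "frozen emulator" framing in the abstract. Whether the biased run actually spreads further depends on the assumed `strength` and `bandwidth` values.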




