CryoProt: A Protein Pretraining Framework with Cross-Box Interactions on Cryo-EM Density Maps
Title: CryoProt: A Protein Pretraining Framework Leveraging Cross-Box Interactions in Cryo-EM Density Maps
Abstract:
Although cryo-electron microscopy (cryo-EM) density maps are increasingly accessible, using them to learn robust protein representations remains a significant hurdle. Current methods face two primary limitations: they lack a general-purpose pretraining framework adapted to cryo-EM density maps for predicting protein-related properties, and they generally treat local box regions within a density map as isolated entities. Processing boxes independently ignores cross-box interactions, which are critical for capturing the global structural context inherent in cryo-EM data.
To overcome these obstacles, we introduce CryoProt, a novel protein pretraining framework tailored for cryo-EM density maps. Central to CryoProt is a Map Encoder utilizing multi-head latent attention (MLA). This architecture allows box-level representations to interact via a shared latent space, thereby explicitly capturing cross-box dependencies within the density map. Additionally, we employ a multi-task pretraining strategy to acquire generalizable representations that transfer effectively to various downstream applications. Notably, these representations can be applied to tasks such as protein flexibility prediction, even when cryo-EM density maps are absent and must be implicitly inferred by the pretrained model.
Our experimental evaluations show that CryoProt consistently surpasses current state-of-the-art techniques across multiple benchmarks. It delivers improvements of up to 12% over the top-performing baselines, underscoring the value of modeling cross-box interactions in cryo-EM datasets. The source code for CryoProt is publicly accessible at https://anonymous.4open.science/r/CryoProt.
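To make the idea of cross-box interaction through a shared latent space concrete, the following is a minimal NumPy sketch, not the authors' Map Encoder. The abstract does not specify the MLA design, so this sketch substitutes a simple latent-bottleneck attention: a small set of latent vectors first reads from all box tokens, and the box tokens then read back from the latents. The function names (`split_into_boxes`, `attend`, `cross_box_layer`), the box size, the embedding width, and the omission of learned query/key/value projections are all illustrative assumptions.

```python
import numpy as np

def split_into_boxes(density, box):
    """Partition a cubic 3D map into non-overlapping box^3 regions (hypothetical helper)."""
    d = density.shape[0]
    assert density.shape == (d, d, d) and d % box == 0
    n = d // box
    # (n, box, n, box, n, box) -> (n, n, n, box, box, box) -> (n^3, box^3)
    blocks = density.reshape(n, box, n, box, n, box).transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(n ** 3, box ** 3)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, keys_values, heads):
    """Multi-head scaled dot-product attention; learned projections omitted for brevity."""
    nq, dim = queries.shape
    nk = keys_values.shape[0]
    hd = dim // heads
    q = queries.reshape(nq, heads, hd).transpose(1, 0, 2)      # (heads, nq, hd)
    k = keys_values.reshape(nk, heads, hd).transpose(1, 0, 2)  # (heads, nk, hd)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(hd)            # (heads, nq, nk)
    out = softmax(scores) @ k                                  # keys double as values here
    return out.transpose(1, 0, 2).reshape(nq, dim)

def cross_box_layer(box_tokens, latents, heads=4):
    """Assumed latent-bottleneck layer: boxes exchange information only via the latents."""
    latents = latents + attend(latents, box_tokens, heads)     # latents read all boxes
    box_tokens = box_tokens + attend(box_tokens, latents, heads)  # boxes read the latents
    return box_tokens, latents

rng = np.random.default_rng(0)
density = rng.random((16, 16, 16))              # toy stand-in for a density map
boxes = split_into_boxes(density, box=8)        # 8 boxes, 512 voxels each
embed = rng.normal(size=(512, 32)) / np.sqrt(512)  # random linear box embedding (assumption)
latents = rng.normal(size=(4, 32))              # 4 shared latent vectors (assumption)
updated, latents_out = cross_box_layer(boxes @ embed, latents)
```

In this sketch each updated box token depends on every other box, because all boxes write into and read from the same latents. The cost scales with the number of boxes times the number of latents rather than quadratically in the number of boxes.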
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





