arXiv

AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

June 2, 2026 · Sahil Rahman, Maxx Richard Rahman

Title: AgentPLM: Reasoning-Augmented Decoding for Agentic Protein Language Models in Sequence Design

Abstract: Current protein language models (PLMs) function primarily as passive predictors, producing sequences through a single forward pass. This architecture lacks the capacity to seek external biophysical feedback or adjust the generation process when candidates fail to meet thermodynamic or structural requirements. To overcome these limitations, we present AgentPLM, a framework that integrates two key innovations into a pre-trained PLM: Reasoning-Augmented Decoding (RAD) and Contrastive Agent Policy Optimisation (CAPO). RAD interleaves autoregressive generation with tool calls to external systems such as ESMFold, FoldX, and AutoDock Vina. CAPO is a trajectory-level adaptation of direct preference optimisation that enables end-to-end training, teaching the model to discern when oracle feedback is valuable rather than simply mimicking high-fitness sequences. We assessed AgentPLM on benchmark tasks including de novo enzyme design, antibody optimisation, thermostability enhancement, protein-protein interaction (PPI) interface design, and zero-shot fitness prediction, using standardised oracle APIs and controlled sequence-identity splits. The results demonstrate that AgentPLM attains state-of-the-art performance, notably improving the top-10% hit rate for antibodies compared to the most effective passive baseline. Furthermore, we provide mechanistic evidence of the model’s ability to perform online error correction without relying on explicit backtracking.
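
The abstract describes CAPO only as a trajectory-level adaptation of direct preference optimisation and does not give its objective. As a rough illustration of what "trajectory-level" could mean, the sketch below applies the standard DPO loss to summed log-probabilities of a preferred and a dispreferred trajectory. It is an assumption-laden sketch, not the authors' method: the function names, the `beta` value, and the choice to score a trajectory by summing the log-probabilities of the model's own steps are all hypothetical.

```python
import math


def trajectory_logprob(step_logprobs):
    """Score a trajectory as the sum of its per-step log-probabilities.

    Assumption: only steps emitted by the model (residue tokens and
    tool-call decisions) are included, not text returned by oracles.
    """
    return sum(step_logprobs)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_trajectory_loss(policy_win, policy_lose, ref_win, ref_lose, beta=0.1):
    """Standard DPO loss evaluated on whole trajectories.

    Each argument is a list of per-step log-probabilities for the
    preferred ("win") or dispreferred ("lose") trajectory, under the
    trained policy or the frozen reference model. `beta` is hypothetical.
    """
    win_ratio = trajectory_logprob(policy_win) - trajectory_logprob(ref_win)
    lose_ratio = trajectory_logprob(policy_lose) - trajectory_logprob(ref_lose)
    margin = beta * (win_ratio - lose_ratio)
    # -log(sigmoid(margin)) is the same quantity as softplus(-margin).
    return softplus(-margin)


# Policy identical to the reference: the margin is zero.
loss_equal = dpo_trajectory_loss([-1.0, -0.5], [-2.0], [-1.0, -0.5], [-2.0])

# Policy shifted toward the preferred trajectory: the loss decreases.
loss_better = dpo_trajectory_loss([-0.5, -0.5], [-3.0], [-1.0, -1.0], [-2.0])
```

When the policy matches the reference the loss equals log 2, and it falls as the policy assigns relatively more probability to the preferred trajectory, which is the behaviour any preference-based objective of this family would be expected to share.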
