arXiv

Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection

June 3, 2026 · Sihang Zeng, Matthew Thompson, Ruth Etzioni, Meliha Yetisgen

Abstract:

Constructing patient trajectories from longitudinal electronic health records (EHRs) demands sophisticated reasoning capabilities to handle sparse, noisy, and lengthy multimodal sequences. While current large language model (LLM)-based multi-agent architectures effectively manage context length, they typically evaluate patients in isolation. This approach overlooks a critical aspect of clinical practice: the reliance on accumulated experience derived from analogous prior cases. To address this limitation, we introduce Traj-Evolve, a self-evolving multi-agent framework featuring two synergistic evolution mechanisms.

The first mechanism employs an Experience Pool (ExPool), functioning as a non-parametric memory system. It indexes rejection-sampled reasoning traces to identify and retrieve similar patient cases, thereby providing few-shot contexts. The second mechanism utilizes multi-agent reinforcement learning (MARL), specifically through reward-ranked fine-tuning, to parametrically enhance the collaboration between agents and the memory system. By integrating a leave-one-out cross-retrieval strategy, the system unifies these two approaches, ensuring that training and inference behaviors remain aligned under retrieval augmentation.
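The interplay between the two mechanisms can be illustrated with a minimal sketch. It is not the authors' implementation: the `Experience` fields, the cosine similarity, the scalar `reward`, and the reading of leave-one-out cross-retrieval as "exclude the patient's own trace when retrieving during training" are all assumptions made for illustration, and only the selection step of reward-ranked fine-tuning is shown (no model is actually fine-tuned).

```python
import math
from dataclasses import dataclass


@dataclass
class Experience:
    patient_id: str
    embedding: list   # hypothetical patient representation
    trace: str        # reasoning trace kept after rejection sampling
    reward: float     # hypothetical scalar score for the trace


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def reward_ranked_select(samples):
    """Keep only the highest-reward trace per patient (assumed selection rule)."""
    best = {}
    for e in samples:
        if e.patient_id not in best or e.reward > best[e.patient_id].reward:
            best[e.patient_id] = e
    return list(best.values())


def retrieve(pool, query_embedding, k, exclude_id=None):
    """Return the k most similar experiences.

    Passing exclude_id removes that patient's own entry, which is the
    assumed leave-one-out behaviour used while training on that patient.
    """
    candidates = [e for e in pool if e.patient_id != exclude_id]
    candidates.sort(key=lambda e: cosine(e.embedding, query_embedding), reverse=True)
    return candidates[:k]


# Toy data, invented for this example.
samples = [
    Experience("p1", [1.0, 0.0], "trace p1-a", 0.2),
    Experience("p1", [1.0, 0.0], "trace p1-b", 0.9),
    Experience("p2", [0.9, 0.1], "trace p2-a", 0.7),
    Experience("p3", [0.0, 1.0], "trace p3-a", 0.8),
]
pool = reward_ranked_select(samples)

# Training on p1: its own trace is held out, so the nearest neighbour is p2.
neighbours = retrieve(pool, [1.0, 0.0], k=1, exclude_id="p1")
print([e.patient_id for e in neighbours])  # ['p2']
```

Without the exclusion, a training patient would trivially retrieve its own trace, a situation that cannot arise for an unseen patient at inference time. Holding it out keeps the few-shot context seen during training comparable to the one seen at inference.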

We evaluated Traj-Evolve on a lung cancer prediction task using multimodal EHR data spanning up to five years. The results demonstrate that our system surpasses nine strong baseline models, both within the general population and in the more challenging never-smoker subgroup. Further analysis of the evolving dynamics yields three primary insights: (1) as the ExPool grows, the optimal retrieval strategy shifts from diverse samples toward more specific ones; (2) under MARL, the manager agent’s prediction loss stabilizes rapidly, whereas worker agents continue to improve their temporal reasoning as they encounter more verified patients; and (3) the two mechanisms have complementary effects on predicted risk, with ExPool boosting specificity and MARL enhancing sensitivity.
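For readers less familiar with the two metrics named in insight (3), the sketch below computes them from binary labels. The function name and the toy labels are invented for illustration and are not taken from the paper's data or results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = cancer, 0 = no cancer."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-cases correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-cases wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels: two true cases and three non-cases.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, round(spec, 3))  # 0.5 0.667
```

Under these definitions, a mechanism that lowers predicted risk for non-cases reduces false positives and raises specificity, while one that raises predicted risk for true cases reduces false negatives and raises sensitivity.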
