Response-Aware Multimodal Learning for Post-Treatment Visual Acuity Forecasting
Title: Leveraging Response-Aware Multimodal Learning to Predict Visual Acuity Following Treatment
Abstract: Accurate long-term visual acuity (VA) projections are vital for managing diabetic macular edema (DME), particularly for guiding patient counseling, setting realistic expectations, and scheduling follow-ups. However, clinicians must often estimate these long-term trajectories from early post-treatment data alone, which makes reliable prognosis difficult. Previous OCT-based methods have predominantly targeted short-term responses or single-timepoint predictions, and few have modeled VA trajectories across several future intervals from early longitudinal observations.
To address this gap, we constructed a real-world cohort of 188 DME patients treated with anti-VEGF therapy. The dataset includes paired baseline and one-month OCT scans, supplemented by tabular OCT-derived biomarkers and non-imaging clinical variables. Using only these early data, we defined a multi-horizon forecasting task that predicts visual outcomes at clinically significant intervals: 3, 6, 12, 18, and 24 months.
We introduce ReVA, a novel response-aware multimodal framework. The model integrates structural features extracted from baseline and month-1 OCT images with tabular clinical data to capture both the initial disease state and the early therapeutic response. ReVA uses spatial attention to retain localized prognostic features in the imaging data, while a dependency-aware tabular encoder models interactions among clinical variables. These multimodal representations are then fused to generate patient-specific long-term VA trajectories.
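To make the idea of attention-weighted image features and multimodal fusion concrete, here is a minimal pure-Python sketch. It is not ReVA's implementation: the abstract does not specify the attention form or the fusion operator, so dot-product scoring against a `query` vector, the month-1 minus baseline difference as an "early response" feature, and simple concatenation are all assumptions made for illustration. The names `attention_pool` and `fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(feature_map, query):
    """Pool per-location feature vectors into one vector.

    feature_map: list of feature vectors, one per spatial location.
    query: scoring vector (assumed here; it would be learned in practice).
    Returns (pooled_vector, attention_weights).
    """
    # Score each location by its dot product with the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in feature_map]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum keeps more signal from highly scored locations.
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, feature_map))
        for d in range(dim)
    ]
    return pooled, weights


def fuse(baseline_vec, month1_vec, tabular_vec):
    """Concatenate baseline, month-1, their difference, and tabular features.

    The difference term is an assumed stand-in for 'early treatment response'.
    """
    response = [m - b for b, m in zip(baseline_vec, month1_vec)]
    return baseline_vec + month1_vec + response + tabular_vec
```

In a trained model, the fused vector would feed a regression head with one output per forecast horizon; that head is omitted here because the abstract gives no detail about it.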
For 24-month VA prediction, the proposed framework achieved an MAE of 0.1246, an RMSE of 0.1621, and an R² of 0.6064, and performance remained consistent across all tested forecast horizons. Our results indicate that integrating early treatment-response signals enables clinically relevant long-term VA forecasting, offering data-driven support for routine anti-VEGF management decisions.
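For readers who want to reproduce this style of evaluation, the three reported metrics can be computed per horizon as in the sketch below. It uses the standard definitions of MAE, RMSE, and R²; the function names and the dictionary layout keyed by horizon in months are assumptions for illustration, not the authors' evaluation code.

```python
import math

# Forecast horizons (in months) named in the abstract.
HORIZONS_MONTHS = (3, 6, 12, 18, 24)


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of floats."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when all targets are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, rmse, r2


def metrics_by_horizon(targets, predictions):
    """Compute metrics for each horizon present in `targets`.

    targets, predictions: dicts mapping horizon (months) to lists of VA values.
    """
    return {
        h: regression_metrics(targets[h], predictions[h])
        for h in HORIZONS_MONTHS
        if h in targets
    }
```

Calling `metrics_by_horizon` with held-out targets and model outputs yields one `(MAE, RMSE, R²)` tuple per horizon, which makes it straightforward to check whether performance is stable from 3 to 24 months.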
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





