PaperVoyager: Building Interactive Web with Visual Language Models
Title: PaperVoyager: Creating Interactive Web Experiences via Visual Language Models
Abstract: The latest developments in visual language models have empowered autonomous agents to handle complex reasoning, utilize tools, and comprehend documents. Despite these advancements, current document agents primarily convert papers into static outputs like summaries, web pages, or slide decks. These static formats fall short when addressing technical papers that rely on dynamic mechanisms and state transitions. To address this limitation, we introduce a Paper-to-Interactive-System Agent capable of transforming research papers into executable, interactive web systems. Operating without human intervention, this agent manages the entire workflow—from understanding the paper and modeling the system to synthesizing the interactive web page—allowing users to adjust inputs and witness dynamic behaviors in real time. To assess this capability, we present a benchmark comprising 19 research papers alongside expert-constructed interactive systems as the ground truth. Additionally, we propose PaperVoyager, a structured generation framework that explicitly captures mechanisms and interaction logic during the synthesis process. Our experiments demonstrate that PaperVoyager markedly enhances the quality of the resulting interactive systems, establishing a novel paradigm for understanding scientific papers through interactivity.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





