SciDER: Scientific Data-centric End-to-end Researcher
Title: SciDER: A Scientific Data-Centric End-to-End Researcher
Abstract: Although large language models have significantly accelerated scientific discovery, current agent-based approaches face substantial limits in adaptability, domain generalization, and multimodal scalability, and they often cannot handle raw, domain-specific experimental data on their own. To address these challenges, we present SciDER, a multi-agent framework designed to flexibly automate the full research lifecycle. The system takes a data-centric approach, with a dynamic multimodal skill set distributed among four sub-agents. An ideation agent first formulates new hypotheses through Evolutionary Idea Search. A data analysis agent then organizes the raw data systematically, and an experimentation agent generates executable code tailored to the specific characteristics of the dataset. Finally, a critic agent drives continuous, iterative self-improvement. To foster open-source scientific discovery, we are releasing OpenSciDER-SFT-8K, a high-quality dataset of execution trajectories, along with the fine-tuned OpenSciDER-27B model. Evaluations across six benchmarks show that both SciDER and OpenSciDER deliver competitive or top-tier performance, with particularly notable improvements in data-centric analysis, end-to-end research execution, and multimodal scientific visualization. By combining data analysis with experimental execution, SciDER closes the gap between abstract scientific reasoning and the synthesis of reproducible experiments.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
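
The four-agent flow described in the abstract (ideation, data analysis, experimentation, critic) can be pictured with a minimal toy sketch. This is not the SciDER implementation: every name below (`ResearchState`, `run_pipeline`, the four `*_agent` functions, `max_rounds`) is a hypothetical placeholder, the agents are hard-coded rules rather than language models, and Evolutionary Idea Search is not implemented. It only illustrates how a shared state might pass between agents, with the critic deciding whether another round is needed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ResearchState:
    """Shared state handed from one sub-agent to the next (hypothetical)."""
    raw_data: List[float]
    hypothesis: str = ""
    summary: Dict[str, Any] = field(default_factory=dict)
    code: str = ""
    result: Any = None
    feedback: List[str] = field(default_factory=list)
    accepted: bool = False


def ideation_agent(state: ResearchState) -> None:
    # Stand-in for hypothesis generation; Evolutionary Idea Search is NOT implemented here.
    state.hypothesis = "The mean of the measurements is positive."


def data_analysis_agent(state: ResearchState) -> None:
    # Organize the raw data into a small summary the next agent can rely on.
    n = len(state.raw_data)
    state.summary = {"n": n, "mean": sum(state.raw_data) / n if n else None}


def experimentation_agent(state: ResearchState) -> None:
    # Emit code that depends on the dataset summary (here: guard against empty data).
    if state.summary["n"] == 0:
        state.code = "result = None"
    else:
        state.code = "result = (sum(data) / len(data)) > 0"


def critic_agent(state: ResearchState) -> None:
    # Run the generated code in its own namespace and judge the outcome.
    namespace: Dict[str, Any] = {"data": state.raw_data}
    exec(state.code, namespace)
    state.result = namespace.get("result")
    if state.result is None:
        state.feedback.append("No usable result; more data is needed.")
    else:
        state.accepted = True


def run_pipeline(raw_data: List[float], max_rounds: int = 3) -> ResearchState:
    """Run ideation once, then loop analysis -> experiment -> critique."""
    state = ResearchState(raw_data=list(raw_data))
    ideation_agent(state)
    for _ in range(max_rounds):
        data_analysis_agent(state)
        experimentation_agent(state)
        critic_agent(state)
        if state.accepted:
            break
    return state
```

Calling `run_pipeline([1.0, 2.0, 3.0])` in this toy setup returns a state whose `accepted` flag is set after one round, while an empty dataset exhausts `max_rounds` and accumulates critic feedback instead.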






