CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
Title: CVEvolve: Enabling Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
Abstract:
Domain scientists frequently encounter significant hurdles when analyzing scientific data, as they often lack the specialized expertise in computing or image processing required to develop task-specific algorithms or AI models. These challenges are particularly acute when the data is sparsely labeled or loosely defined, spans a high dynamic range, or contains substantial noise. To address this gap, we present CVEvolve, an autonomous agentic framework with a zero-code interface for discovering algorithms tailored to scientific data processing.
CVEvolve integrates a multi-round search methodology with a suite of capabilities, including code execution, evaluation implementation, history management, holdout testing, and optional visual inspection of both scientific data and output images. The system’s search process alternates between discovery and refinement phases, employing lineage-aware stochastic candidate sampling to balance exploration with exploitation.
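The abstract does not specify how lineage-aware stochastic candidate sampling is computed, so the following is only a minimal sketch of one plausible scheme, not CVEvolve's actual rule. It assumes each candidate carries a search score (higher is better) and a lineage identifier, and it draws a parent with probability proportional to a softmax of the score divided by the number of pool members in the same lineage. The names `Candidate`, `sampling_weights`, `sample_parent`, and the `temperature` parameter are all hypothetical.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: float   # search-set score, higher is better (assumption)
    lineage: str   # id of the root ancestor of this candidate (assumption)


def sampling_weights(pool, temperature=0.2):
    """Return normalized sampling probabilities for each candidate in the pool.

    Exploitation: a softmax over scores favors strong candidates.
    Exploration: dividing by lineage size keeps one crowded lineage
    from dominating the draw.
    """
    lineage_sizes = Counter(c.lineage for c in pool)
    best = max(c.score for c in pool)
    weights = []
    for c in pool:
        # Subtract the best score before exponentiating for numerical stability.
        fitness = math.exp((c.score - best) / temperature)
        weights.append(fitness / lineage_sizes[c.lineage])
    total = sum(weights)
    return [w / total for w in weights]


def sample_parent(pool, rng, temperature=0.2):
    """Draw one candidate to mutate or refine in the next round."""
    probs = sampling_weights(pool, temperature)
    return rng.choices(pool, weights=probs, k=1)[0]


# Illustrative pool: lineage "A" has two members, lineage "B" has one.
pool = [
    Candidate("a1", 0.80, "A"),
    Candidate("a2", 0.80, "A"),
    Candidate("b1", 0.80, "B"),
]
weights = sampling_weights(pool)
parent = sample_parent(pool, random.Random(0))
```

With equal scores, the single member of lineage "B" receives as much probability as the two members of lineage "A" combined, which is the sense in which this sketch is lineage-aware. Lowering `temperature` shifts the draw toward exploitation, and raising it shifts the draw toward exploration.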
We validated CVEvolve across four distinct applications: image registration in X-ray fluorescence microscopy, Bragg peak detection, image segmentation in high-energy diffraction microscopy, and hybrid analytical-learning-based affine registration. In each case, CVEvolve generated algorithms that outperformed baseline methods. Furthermore, tracking performance on holdout tests revealed that these autonomous discoveries often generalized better than subsequent, over-optimized alternatives. These findings illustrate how zero-code, autonomous, LLM-driven algorithm development can empower domain scientists to transform unstructured scientific image data into actionable algorithms and drive further scientific discovery.
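The holdout tracking mentioned above can be illustrated with a small sketch. It assumes the search history records one search-set score and one holdout score per round. The numbers are invented for illustration and are not results from the paper, and the names `history` and `best_round_by` are hypothetical.

```python
# Each row: (round index, search-set score, holdout score).
# Values are made up for illustration; higher is assumed to be better.
history = [
    (1, 0.70, 0.68),
    (2, 0.78, 0.75),
    (3, 0.85, 0.71),
]


def best_round_by(history, column):
    """Return the round index whose score in the given column is highest."""
    return max(history, key=lambda row: row[column])[0]


best_search_round = best_round_by(history, 1)
best_holdout_round = best_round_by(history, 2)

# If the holdout score peaks before the search score does, later rounds
# may be over-optimized to the search set.
overfit_suspected = best_holdout_round < best_search_round
```

In this made-up history the search score keeps rising through round 3 while the holdout score peaks at round 2, so a user selecting by holdout score would prefer the earlier candidate.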
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




