arXiv

MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition

June 2, 2026 · Yifan Bao, Xinyu Xi, Xinyu Liu, Wen Ge, Lei Jiang, Kevin Zhang, Raad Khraishi, Yihao Ang, Anthony K. H. Tung, Lukasz Szpruch, Hao Ni

Abstract

Automated data science can be framed as a structured model selection problem: for a given task, a system must choose data transformations, feature representations, architectures, training procedures, evaluation protocols, and refinement strategies. Existing AutoML systems automate parts of this workflow, but they generally search only within fixed spaces of pipelines, models, and hyperparameters. Large Language Model (LLM)-based agents are more flexible, drawing on code generation, retrieval, and execution feedback; however, their modeling decisions are often unstructured, which makes them difficult to verify and reuse.

To address these limitations, we present MOSAIC (Modular Orchestration for Structured Agentic Intelligence and Composition), a framework for memory-grounded model selection and workflow construction. MOSAIC analyzes a task and its associated dataset to generate a semantic task profile. It then retrieves relevant historical cases and source-code modules to construct a blueprint, an intermediate representation that defines the selected modeling components, their composition, interface constraints, and execution requirements. This approach turns model selection into a staged, context-aware search and anchors LLM-driven code generation in retrieved evidence rather than unconstrained synthesis.
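To make the notion of a blueprint more concrete, the following is a minimal sketch of how such an intermediate representation might be expressed as a data structure. The abstract does not specify a schema, so every name here (`Component`, `Blueprint`, `interface_violations`, the shape fields, and the `source` identifiers) is a hypothetical illustration rather than the MOSAIC implementation. Interface constraints are reduced to a simple check that adjacent components agree on shapes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """One selected modeling component (all field names are hypothetical)."""
    name: str
    input_shape: tuple   # shape the component expects, e.g. (window, features)
    output_shape: tuple  # shape the component produces
    source: str = ""     # id of the retrieved case or code module it came from


@dataclass
class Blueprint:
    """Ordered composition of components plus execution requirements."""
    components: list
    requirements: dict = field(default_factory=dict)

    def interface_violations(self):
        """Return a message for every adjacent pair whose shapes do not match."""
        problems = []
        for upstream, downstream in zip(self.components, self.components[1:]):
            if upstream.output_shape != downstream.input_shape:
                problems.append(
                    f"{upstream.name} -> {downstream.name}: "
                    f"{upstream.output_shape} != {downstream.input_shape}"
                )
        return problems


# Illustrative blueprint with one deliberate mismatch between encoder and head.
blueprint = Blueprint(
    components=[
        Component("scaler", (64, 8), (64, 8), source="case-001"),
        Component("encoder", (64, 8), (32,), source="module-a"),
        Component("head", (16,), (1,), source="module-b"),
    ],
    requirements={"device": "cpu"},
)
violations = blueprint.interface_violations()
print(violations)
```

Checking constraints on the blueprint before any code is generated is one way a structured intermediate representation could catch composition errors early; recording a `source` per component is one way to keep each choice traceable to retrieved evidence.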

Candidate models are validated by executing them and are then refined using diagnostic feedback, training traces, task metrics, and a failure-aware reinforcement learning policy. We demonstrate MOSAIC on financial time-series forecasting and generation, where models must meet strict criteria for predictive accuracy, distributional fidelity, execution reliability, and downstream financial metrics such as risk and tail behavior. Comparative experiments against AutoML and agentic baselines indicate that MOSAIC improves task performance, execution success rates, and decision traceability. These results underscore the benefits of treating automated data science as a structured, reusable, and execution-grounded model selection process.
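The execution-grounded validation step can be illustrated with a small sketch. This is a simplified stand-in, not the method described above: it replaces the reinforcement learning policy with a plain greedy loop over a fixed candidate list, and the names `run_candidate`, `refine`, and the toy candidates are assumptions made for illustration. It shows only the general idea of running each candidate, recording a diagnostic on failure, and keeping the best candidate that executes.

```python
import traceback


def run_candidate(build, data):
    """Execute a candidate; return (score, diagnostic), with score None on failure."""
    try:
        return build()(data), None
    except Exception:
        return None, traceback.format_exc(limit=1)


def refine(candidates, data, max_rounds=3):
    """Try up to max_rounds candidates in order and keep the best that executes."""
    best_name, best_score, log = None, float("-inf"), []
    for name, build in candidates[:max_rounds]:
        score, diagnostic = run_candidate(build, data)
        log.append((name, score, diagnostic))  # trace of every decision
        if score is not None and score > best_score:
            best_name, best_score = name, score
    return best_name, best_score, log


# Toy candidates (hypothetical); scores are negative errors, so higher is better.
def broken():
    raise ValueError("shape mismatch")


def mean_model():
    return lambda xs: -abs(sum(xs) / len(xs) - 2.0)


def last_value_model():
    return lambda xs: -abs(xs[-1] - 2.0)


candidates = [("broken", broken), ("mean", mean_model), ("last", last_value_model)]
best_name, best_score, log = refine(candidates, [1.0, 2.0, 3.0])
print(best_name)
```

The `log` keeps the failure diagnostic alongside each score, which is the kind of record a failure-aware policy could learn from; how MOSAIC actually uses such signals is not specified in the abstract.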
