arXiv

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

June 2, 2026 · Dadi Guo, Yuejin Xie, Qingyu Liu, Weixian Huang, Jiayu Liu, Zhiyuan Fan, Qihan Ren, Shuai Shao, Tianyi Zhou, Jianjie Feng, Wenze Su, Yujiu Yang, Dongrui Liu, Yi R. Fung

Abstract:

As large language models (LLMs) progressively enhance their mathematical reasoning to approach International Mathematical Olympiad (IMO) standards and research-level complexity, the lack of challenging, high-quality problems has emerged as a critical constraint. This scarcity hinders the training, evaluation, and self-improvement cycles of these models. Concurrently, recent advancements in code agents have showcased advanced capabilities in agentic coding and logical reasoning, indicating that code execution environments can function as scalable platforms for mathematical experimentation.

This study explores whether code agents can autonomously evolve existing mathematical problems into more complex variations. We present a multi-agent framework designed to evolve problems while rigorously verifying both the solvability and the increased difficulty of the results. Our experiments show that, given sufficient test-time exploration, code agents can generate new problems that are structurally distinct from their sources and more difficult than them, while remaining solvable. These findings provide empirical support for code-driven agents as a viable way to synthesize high-difficulty mathematical reasoning problems within scalable computational environments. The associated code and data are available at https://github.com/TarferSoul/Code2Math.
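As a rough illustration of the evolve-then-verify idea described in the abstract, the following toy sketch proposes variants of a seed problem and accepts one only if code execution confirms it is solvable and a difficulty proxy increases. This is not the paper's framework: the `Problem` family, the `is_solvable` criterion, and the `difficulty` proxy are all hypothetical stand-ins chosen so the example stays small and deterministic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Problem:
    """Toy family (assumption): count n in [1, limit] divisible by at least one modulus."""
    moduli: tuple[int, ...]
    limit: int

    def statement(self) -> str:
        return (f"How many integers n with 1 <= n <= {self.limit} are "
                f"divisible by at least one of {list(self.moduli)}?")


def brute_force_answer(p: Problem) -> int:
    """Reference answer obtained by direct enumeration, i.e. by executing code."""
    return sum(1 for n in range(1, p.limit + 1)
               if any(n % m == 0 for m in p.moduli))


def is_solvable(p: Problem) -> bool:
    """Hypothetical solvability check: the answer exists and is non-degenerate."""
    return 0 < brute_force_answer(p) < p.limit


def difficulty(p: Problem) -> int:
    """Hypothetical difficulty proxy: number of distinct, non-redundant moduli.

    A modulus is redundant if it is a multiple of another modulus, because
    it then adds no new integers to the count.
    """
    unique = set(p.moduli)
    return sum(1 for m in unique
               if not any(o != m and m % o == 0 for o in unique))


def evolve(seed: Problem, candidates: Iterable[int]) -> Optional[Problem]:
    """Return the first variant that is solvable and harder than the seed."""
    for m in candidates:
        variant = Problem(seed.moduli + (m,), seed.limit)
        if is_solvable(variant) and difficulty(variant) > difficulty(seed):
            return variant
    return None  # exploration budget exhausted without an accepted variant


seed = Problem(moduli=(2, 3), limit=100)
# Candidates 4 and 6 are rejected as redundant (multiples of 2); 5 is accepted.
evolved = evolve(seed, candidates=[4, 6, 5])
print(evolved.statement(), "->", brute_force_answer(evolved))
```

In this sketch the candidate list plays the role of test-time exploration: a longer list gives the loop more chances to find a variant that passes both checks. A real system would replace the fixed candidates with agent-proposed edits and the proxy with a measured difficulty signal.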
