Global News Digest

arXiv

OctoT2I: A Self-Evolving Agentic Text-to-Image Router

Abstract

As the landscape of Text-to-Image (T2I) models expands—ranging from massive architectures to streamlined, real-time variants—the industry is encountering diminishing returns from scaling individual models. To overcome this stagnation, agentic T2I approaches have emerged, leveraging multiple models to enhance output. However, current agentic solutions are hindered by three primary limitations: their dependence on costly handcrafted priors or human annotations, inflexible single-path decision-making processes, and a general disregard for inference efficiency.

In response to these issues, we present OctoT2I, an innovative agentic framework that redefines the T2I task as a joint optimization problem focusing on both generation quality and inference speed. OctoT2I utilizes a stateful, multi-round routing strategy that dynamically selects the most appropriate tool by leveraging its internal knowledge and memory. This adaptive selection is powered by a knowledge base constructed entirely through our novel Self-Evolving Mechanism, which operates without human supervision.
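The routing idea described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the OctoT2I code: the `Tool` and `RouterState` classes, the `utility` trade-off (estimated quality minus a latency penalty), the `speed_weight` and `threshold` parameters, and the `judge` callback that stands in for whatever quality check the framework actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    latency_s: float   # hypothetical per-call latency in seconds
    skill: dict        # hypothetical quality estimate per dimension, in [0, 1]


@dataclass
class RouterState:
    # Memory carried across rounds: names of the tools already tried.
    history: list = field(default_factory=list)


def utility(tool, dims, speed_weight):
    """Joint objective: mean estimated quality minus a latency penalty."""
    quality = sum(tool.skill.get(d, 0.0) for d in dims) / max(len(dims), 1)
    return quality - speed_weight * tool.latency_s


def route(tools, dims, state, speed_weight=0.1):
    """Pick the highest-utility tool that has not been tried yet."""
    untried = [t for t in tools if t.name not in state.history]
    candidates = untried or tools
    best = max(candidates, key=lambda t: utility(t, dims, speed_weight))
    state.history.append(best.name)
    return best


def generate(tools, dims, judge, max_rounds=3, threshold=0.8):
    """Multi-round loop: route, check the result, stop once it is good enough."""
    state = RouterState()
    for _ in range(max_rounds):
        tool = route(tools, dims, state)
        if judge(tool, dims) >= threshold:
            break
    return state.history
```

In this sketch a cheap tool is tried first when its utility is highest, and a slower, stronger tool is used only if the first result falls below the threshold. That is one simple way to trade quality against inference cost; the paper's actual routing policy may differ.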

This mechanism first autonomously establishes foundational Conceptual Dimensions, such as style, color, and count. It then intelligently explores the combinations of these dimensions through an iterative "Propose–Solve–Evaluate–Learn" (PSEL) loop. By efficiently mapping the capability boundaries of each tool, the PSEL loop drives continuous improvement without the need for external guidance.
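A toy version of such a loop might look like the following. It is a sketch under stated assumptions, not the authors' implementation: the `psel_loop` and `best_tool` functions, the "least-explored first" proposal rule, the `tools` mapping of names to generator callables, and the `evaluate` scoring callback are all hypothetical. Only the three dimension names come from the abstract.

```python
import itertools
import random

# Dimension names taken from the abstract; the rest is illustrative.
DIMENSIONS = ["style", "color", "count"]


def psel_loop(tools, evaluate, rounds=4, per_round=2, seed=0):
    """Run a toy Propose-Solve-Evaluate-Learn loop.

    tools:    dict mapping a tool name to a callable(combo) -> output
    evaluate: callable(combo, output) -> score (hypothetical automatic judge)
    Returns a knowledge base: (combo, tool name) -> list of scores.
    """
    rng = random.Random(seed)
    combos = [
        c
        for r in range(1, len(DIMENSIONS) + 1)
        for c in itertools.combinations(DIMENSIONS, r)
    ]
    knowledge = {}

    def trials(combo):
        return sum(len(knowledge.get((combo, name), [])) for name in tools)

    for _ in range(rounds):
        # Propose: shuffle, then take the least-explored combinations.
        rng.shuffle(combos)
        proposed = sorted(combos, key=trials)[:per_round]
        for combo in proposed:
            for name, solve in tools.items():
                output = solve(combo)             # Solve
                score = evaluate(combo, output)   # Evaluate
                knowledge.setdefault((combo, name), []).append(score)  # Learn
    return knowledge


def best_tool(knowledge, combo):
    """Return the tool with the highest mean score on a combination."""
    means = {
        name: sum(scores) / len(scores)
        for (c, name), scores in knowledge.items()
        if c == combo
    }
    return max(means, key=means.get) if means else None
```

The resulting knowledge base records, per combination of dimensions, how well each tool scored, which is the kind of capability map a router could consult at inference time.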

Extensive experiments confirm that OctoT2I strikes an exceptional balance between performance and efficiency. It achieves a competitive score of 0.96 on GenEval while delivering a 90.3% increase in inference speed and a 56.6% gain in energy efficiency compared to the leading baseline, Flow-GRPO. The associated code and models will be made publicly available.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
