arXiv

Multi-Agent Computer Use

Abstract:

Current implementations of Computer Use Agents (CUAs) predominantly rely on a single, serial agent architecture. This approach is ill-suited for complex, long-horizon tasks that require task decomposition, parallel processing, and dynamic re-planning in response to new information. This paper advocates for a shift toward the development and evaluation of Multi-Agent Computer Use (MACU) systems. By prioritizing coordinated planning and parallel execution, these systems address numerous limitations inherent in single-agent CUAs.

We introduce a general multi-agent framework where a central manager model structures computer use tasks as a Directed Acyclic Graph (DAG). This structure encodes specific goals and dependencies for various subagents. During each iteration, the manager assigns parallel CUA subagents to execute nodes located on the ready frontier of the DAG. As subagents report new findings, the manager continuously updates the DAG by adding, canceling, or rewriting nodes. This architecture treats the partially observable nature of computer environments as a primary challenge; crucial information that downstream agents might not be able to re-observe is preserved and transmitted through the manager and the DAG structure.
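The manager loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Node`, `ready_frontier`, `run_subagent`, `run_manager`, and the `replan` hook are hypothetical, and the subagent is a placeholder function rather than a real CUA. The sketch assumes a well-formed DAG held in a dictionary, dispatches every node on the ready frontier in parallel, forwards upstream findings to downstream nodes, and lets a callback add, cancel, or rewrite nodes between iterations.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Node:
    goal: str                                   # what the subagent should achieve
    deps: Set[str] = field(default_factory=set)  # names of prerequisite nodes
    status: str = "pending"                     # "pending", "done", or "canceled"
    finding: Optional[str] = None               # information reported back to the manager


def ready_frontier(dag: Dict[str, Node]) -> List[str]:
    """Return pending nodes whose dependencies are all done (sorted for determinism)."""
    return sorted(
        name for name, node in dag.items()
        if node.status == "pending"
        and all(dag[dep].status == "done" for dep in node.deps)
    )


def run_subagent(node: Node, context: Dict[str, Optional[str]]) -> str:
    # Hypothetical stand-in for a CUA subagent; a real one would drive a browser or desktop.
    return f"done: {node.goal} (saw {sorted(context)})"


def run_manager(
    dag: Dict[str, Node],
    replan: Optional[Callable[[Dict[str, Node]], None]] = None,
    max_workers: int = 4,
) -> List[List[str]]:
    """Run frontier nodes in parallel until none are ready; return the executed waves."""
    waves: List[List[str]] = []
    while True:
        frontier = ready_frontier(dag)
        if not frontier:
            break
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {
                name: pool.submit(
                    run_subagent,
                    dag[name],
                    # Pass upstream findings along, since a downstream agent
                    # may be unable to re-observe them in the environment.
                    {dep: dag[dep].finding for dep in dag[name].deps},
                )
                for name in frontier
            }
            for name, future in futures.items():
                dag[name].finding = future.result()
                dag[name].status = "done"
        waves.append(frontier)
        if replan is not None:
            replan(dag)  # the manager may add, cancel, or rewrite nodes here
    return waves


dag = {
    "find_flights": Node("search for flights"),
    "find_hotels": Node("search for hotels"),
    "build_itinerary": Node("combine results", deps={"find_flights", "find_hotels"}),
}
waves = run_manager(dag)
print(waves)
```

In this toy run the two search nodes form the first frontier and execute concurrently, and the itinerary node runs only once both have reported their findings. Threads are used here purely for illustration; the source does not specify how subagents are scheduled.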

Our results show that MACU consistently outperforms strong single-agent baselines, with improvements ranging from $3.4\%$ to $25.5\%$ across a desktop benchmark (OSWorld) and web navigation benchmarks (Online-Mind2Web, WebTailBench, and Odysseys). The system also demonstrates superior test-time scaling and resolves complex, long-horizon tasks that typically cause single-agent CUAs to stall. Notably, on Odysseys, a benchmark focused on long-horizon web navigation, MACU reduced the average wall-clock time for task completion by a factor of approximately $1.5\times$, indicating that it can accelerate traditionally slow CUA workflows. These findings underscore multi-agent coordination as a vital pathway for scaling computer use agents to operate more effectively and productively over extended periods. All code and interactive visualizations are available at https://jykoh.com/multi-agent-computer-use.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
