arXiv

Tree-Guided Identify-Then-Exploit: A Unified Framework of Best Arm Identification and Regret Minimization for Dueling Bandits

Abstract:

This paper studies $N$-armed stochastic dueling bandits under the Condorcet-winner assumption, focusing on three common objectives: best-arm identification (BAI), weak regret, and strong regret. We introduce Tree-Guided Identify-Then-Exploit (TG-ITE), which is, to the best of our knowledge, the first unified framework that addresses all three objectives. Without requiring stronger assumptions, TG-ITE uses a shared tree-guided identification mechanism that locates a high-confidence incumbent in $O(N)$ comparisons. It then applies objective-specific exploitation strategies that build on this warm start.
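The abstract does not spell out the tree-guided procedure, so the following is only a minimal sketch of the general identify-then-exploit idea, not the paper's TG-ITE algorithm. It swaps in a plain single-elimination (knockout) tournament for the identification stage. The preference matrix `p`, the helper names `duel` and `knockout_identify`, and the fixed `duels_per_match` budget are all assumptions introduced for illustration.

```python
import random


def duel(p, i, j, rng):
    """Simulate one noisy comparison; p[i][j] is the probability that i beats j."""
    return i if rng.random() < p[i][j] else j


def knockout_identify(p, duels_per_match, rng):
    """Pick an incumbent with a single-elimination tournament (illustrative only).

    Each match is a majority vote over `duels_per_match` duels (use an odd
    number to avoid ties). Every match eliminates one arm, so there are
    N - 1 matches in total.
    """
    alive = list(range(len(p)))
    comparisons = 0
    while len(alive) > 1:
        next_round = []
        # Pair up adjacent arms; an unpaired last arm advances with a bye.
        for k in range(0, len(alive) - 1, 2):
            a, b = alive[k], alive[k + 1]
            wins_a = sum(duel(p, a, b, rng) == a for _ in range(duels_per_match))
            comparisons += duels_per_match
            next_round.append(a if 2 * wins_a > duels_per_match else b)
        if len(alive) % 2 == 1:
            next_round.append(alive[-1])
        alive = next_round
    return alive[0], comparisons
```

With $N$ arms and $k$ duels per match, this stage uses exactly $(N-1)k$ comparisons, which is linear in $N$ for fixed $k$. The fixed budget is a simplification and carries no confidence guarantee. An exploitation stage could then, for example, repeatedly duel the incumbent against itself, which incurs no regret when the incumbent is the Condorcet winner.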

This approach yields four key advantages: (1) it attains $O(N)$ sample complexity for BAI without the stronger assumptions typically required; (2) it gives the first winner-stays-style algorithm that achieves $O(N)$ weak regret; (3) it matches the $O(N \log T)$ strong-regret guarantee of specialized methods; and (4) it enables joint optimization of BAI and weak regret with $O(N)$ guarantees for both, closing the $O(\log N)$ sub-optimality gap in prior approaches. These findings suggest that the trade-off between BAI and regret minimization is comparatively mild in dueling bandits.
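For reference, the objectives named above can be written out under one common convention from the dueling-bandits literature. The abstract does not state its exact definitions, so the symbols $p_{i,j}$, $\Delta_j$, $a^*$, and $(i_t, j_t)$ are assumptions introduced here. Let $p_{i,j}$ be the probability that arm $i$ beats arm $j$, and let $(i_t, j_t)$ be the pair dueled in round $t$. A Condorcet winner is an arm $a^*$ that beats every other arm with probability above one half:

$$
p_{a^*, j} > \tfrac{1}{2} \quad \text{for all } j \neq a^*, \qquad \Delta_j := p_{a^*, j} - \tfrac{1}{2}.
$$

With $\Delta_{a^*} = 0$, the cumulative weak and strong regret over $T$ rounds are then often defined as

$$
R_T^{\text{weak}} = \sum_{t=1}^{T} \min\{\Delta_{i_t}, \Delta_{j_t}\}, \qquad
R_T^{\text{strong}} = \sum_{t=1}^{T} \max\{\Delta_{i_t}, \Delta_{j_t}\}.
$$

Under this convention, weak regret vanishes in any round where at least one of the two arms is $a^*$, while strong regret vanishes only when both are. Some works use the average of the two gaps in place of the maximum for strong regret.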


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
