Two-Fidelity Best-Action Identification for Stochastic Minimax Trees
Title: Dual-Fidelity Best-Action Identification for Stochastic Minimax Trees
Abstract: This paper studies fixed-confidence best-action identification (BAI) in stochastic minimax trees. The problem is increasingly relevant to modern AI planning, where deep minimax search and Monte Carlo Tree Search (MCTS) with long language-model rollouts face a trade-off: heuristic evaluations are cheap but biased, whereas exact rollouts are reliable but computationally expensive. To address this, we introduce 2FFS, a tree-search algorithm that brings multi-fidelity flat-bandit principles to tree structures. By combining minimax-style rapid expansion with MCTS-style stochastic sampling, 2FFS decides dynamically when to rely on cheap, biased evaluations and when to spend costly, accurate evaluations on local verification. We establish the algorithm's fixed-confidence correctness, show that it stops in finite time for exact identification, and derive a polynomial-depth cost upper bound for general-depth trees. Numerical experiments on stochastic trees indicate that 2FFS uses substantially fewer samples and computational operations than established BAI-MCTS baselines.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




