arXiv

Mean-based algorithms: A lower bound and regret

Abstract

Mean-based algorithms are online learning methods that favor actions according to their historical average rewards, typically assigning low selection probability to actions that have performed worse. Recent literature suggests that these algorithms converge to serially undominated actions, which serve as approximations of Nash equilibria in economic contexts. However, empirical evidence indicates that they may converge more slowly than established methods in bandit-feedback environments.
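
For orientation, one common formalization from prior work on mean-based learning is sketched below. The notation is an assumption and is not taken from this abstract: $\sigma_{i,t}$ denotes the cumulative reward of action $i$ up to round $t$, $p_{i,t}$ the probability of playing action $i$ in round $t$, and $T$ a known horizon. An algorithm is $\gamma$-mean-based if

$$
\sigma_{i,t} < \sigma_{j,t} - \gamma T \;\Longrightarrow\; p_{i,t} \le \gamma \quad \text{for all actions } i, j \text{ and rounds } t,
$$

and mean-based if this holds for some $\gamma = o(1)$. A natural unknown-horizon analogue, assumed here for illustration only, replaces the fixed pair $(\gamma, T)$ with a sequence $\gamma_t$ and the current round $t$:

$$
\sigma_{i,t} < \sigma_{j,t} - \gamma_t\, t \;\Longrightarrow\; p_{i,t} \le \gamma_t .
$$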

This study investigates mean-based algorithms when the time horizon is unknown and only bandit feedback is available. We present the first lower bound on the algorithm-defining sequence $\gamma_t$, which limits how fast such algorithms can learn. We also introduce two new mean-based algorithms: one generalizes $\epsilon$-greedy strategies, and the other extends the mean-based Exp3 framework to unknown time horizons.
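
To make the setting concrete, the following is a minimal sketch of an $\epsilon$-greedy learner whose exploration rate depends only on the current round, so it needs no knowledge of the horizon. It is not the algorithm proposed in the paper. The schedule `epsilon_schedule`, its constant `c`, the Bernoulli reward model, and the use of empirical averages are all assumptions chosen for illustration.

```python
import random


def epsilon_schedule(t, n_arms, c=1.0):
    """Hypothetical exploration rate for round t (1-indexed): min(1, c * n_arms / t)."""
    return min(1.0, c * n_arms / t)


def run_epsilon_greedy(arm_means, horizon, seed=0):
    """Simulate an epsilon_t-greedy learner on Bernoulli arms with bandit feedback.

    Returns (pulls per arm, pseudo-regret against the best fixed arm).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms      # number of times each arm was played
    totals = [0.0] * n_arms   # sum of observed rewards per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        # The schedule uses only t, never the horizon (assumed anytime schedule).
        eps = epsilon_schedule(t, n_arms)
        if rng.random() < eps:
            # Explore: every arm keeps probability at least eps / n_arms.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: play the arm with the highest empirical average reward.
            # Arms never played are scored 0.0 (an illustrative convention).
            averages = [totals[i] / pulls[i] if pulls[i] else 0.0
                        for i in range(n_arms)]
            arm = max(range(n_arms), key=lambda i: averages[i])

        # Bandit feedback: only the reward of the chosen arm is observed.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
        regret += best_mean - arm_means[arm]

    return pulls, regret


if __name__ == "__main__":
    pulls, regret = run_epsilon_greedy([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", pulls)
    print("pseudo-regret:", regret)
```

In this sketch an arm with a lower empirical average is played only during exploration, so its selection probability is at most the current exploration rate, which is the flavor of constraint a sequence such as $\gamma_t$ expresses. How fast the rate may decay before learning breaks down is the kind of question a lower bound on $\gamma_t$ addresses.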

Our experiments show that mean-based algorithms, although they may be marginally slower, remain competitive with other bandit-feedback approaches. We also examine the relationship between mean-based and no-regret algorithms: the intersection of the two classes is non-trivial and depends on the choice of $\gamma_t$, and we prove that algorithms exist that are both mean-based and no-regret. These findings give deeper insight into the "exploitability" of this class of algorithms, building on previous research.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
