arXiv

Trace-Mediated Peak Bias: Bridging Temporal Credit Assignment and Cognitive Heuristics in Deep Reinforcement Learning

Abstract:

While temporal credit assignment is a fundamental component of both biological and artificial intelligence, its interplay with non-linear function approximation remains largely unexplored. In this study, we uncover a systematic failure mode in deep reinforcement learning (RL): Trace-Mediated Peak Bias (TMPB). Specifically, we find that at intermediate eligibility trace depths, agents irrationally prefer trajectories featuring high-magnitude reward "peaks," even when alternative paths offer higher cumulative returns. This phenomenon offers a mechanistic explanation for the Peak-End Rule, a cognitive bias in human memory wherein experiences are evaluated by their most intense moments rather than by their integrated utility.

Our analysis reveals that TMPB arises because eligibility traces amplify distal temporal-difference (TD) errors into "gradient shocks." Stochastic gradient descent (SGD) with a fixed step size cannot normalize these shocks, which results in global overestimation of values. In contrast, adaptive optimizers alleviate the issue through second-moment normalization. These findings imply that human-like saliency distortions may arise naturally from the mathematical constraints of credit assignment in distributed systems, and they highlight adaptive optimization as a theoretical prerequisite for rational value estimation.
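
The contrast between fixed-step SGD and second-moment normalization can be illustrated with a toy sketch. This is not the paper's experimental setup: it assumes a single scalar parameter whose value gradient is 1.0 at every step, an accumulating eligibility trace, a hand-picked TD-error sequence containing one large "peak," and an Adam-style update standing in for "adaptive optimizers." All function names and hyperparameter values are hypothetical and chosen only for illustration.

```python
import math

def trace_weighted_gradients(td_errors, gamma=0.99, lam=0.8):
    """Return delta_t * e_t for an accumulating eligibility trace.

    Simplifying assumption: one scalar parameter with value gradient 1.0
    at every step, so the trace is e_t = gamma * lam * e_{t-1} + 1.0.
    """
    trace, grads = 0.0, []
    for delta in td_errors:
        trace = gamma * lam * trace + 1.0
        grads.append(delta * trace)  # the trace scales the TD error
    return grads

def sgd_steps(grads, lr=0.01):
    """Fixed-step-size SGD: the step is proportional to the raw gradient."""
    return [lr * g for g in grads]

def adam_steps(grads, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam-style steps: the first moment is divided by the square root
    of the second moment, so a single large gradient is rescaled."""
    m = v = 0.0
    steps = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
        steps.append(lr * m_hat / (math.sqrt(v_hat) + eps))
    return steps

# Hypothetical TD-error sequence: small errors with one large "peak".
td_errors = [0.1] * 20 + [50.0] + [0.1] * 5
grads = trace_weighted_gradients(td_errors)
max_sgd = max(abs(s) for s in sgd_steps(grads))
max_adam = max(abs(s) for s in adam_steps(grads))
print(f"largest SGD step:  {max_sgd:.4f}")
print(f"largest Adam step: {max_adam:.4f}")
```

In this sketch the largest SGD step grows with the trace-amplified TD error, while the Adam-style step stays on the order of the learning rate. It illustrates only the normalization argument, not the value overestimation reported in the abstract.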


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
