arXiv

When to Re-Plan: Subgoal Persistence in Hierarchical Latent Reasoning

Title: Optimizing Re-Planning: The Role of Subgoal Persistence in Hierarchical Latent Reasoning

Abstract: Effective long-horizon reasoning demands a delicate balance: a system must adhere to medium-term objectives without becoming overly rigid. Frequent re-planning prevents the emergence of coherent multi-step structures, while excessive commitment causes plans to become obsolete. This paper investigates the trade-off between stability and adaptability within latent reasoning frameworks, where multi-step processing unfolds internally within hidden states rather than through explicit token sequences.

We enhance the Hierarchical Reasoning Model (HRM) by introducing a feudal manager-worker architecture. In this setup, a slower, high-level module issues a normalized directional subgoal that remains active for a specific duration of P low-level steps. This mechanism biases the worker’s hidden-state updates and introduces an intrinsic cosine alignment loss.
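The persistence mechanism described above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`run_worker`, `manager_fn`, `worker_fn`), the additive bias with strength `beta`, and the choice of `1 - cos` between the worker's hidden-state change and the subgoal as the alignment loss are all assumptions made for illustration. Only the ideas of a normalized subgoal, a persistence period P, and a lambda-weighted cosine alignment term come from the abstract.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def run_worker(h0, manager_fn, worker_fn, num_steps, P, beta=0.5):
    """Roll the worker forward, refreshing the subgoal only every P steps.

    Assumed (hypothetical) design: the subgoal is added to the worker's
    update with strength `beta`, and the alignment loss at each step is
    1 - cos(hidden-state change, subgoal).
    Returns (final hidden state, mean alignment loss, refresh steps).
    """
    h = list(h0)
    goal = None
    losses = []
    refresh_steps = []
    for t in range(num_steps):
        if t % P == 0:
            # The slow module issues a new normalized directional subgoal.
            goal = normalize(manager_fn(h))
            refresh_steps.append(t)
        # Worker update, biased toward the currently active subgoal.
        h_next = [x + beta * g for x, g in zip(worker_fn(h), goal)]
        delta = [a - b for a, b in zip(h_next, h)]
        losses.append(1.0 - cosine(delta, goal))
        h = h_next
    return h, sum(losses) / len(losses), refresh_steps


def total_loss(lm_loss, align_loss, lam=0.05):
    """Language-model loss plus the lambda-weighted alignment term."""
    return lm_loss + lam * align_loss
```

With P=1 the subgoal is re-issued at every step, while larger P holds the same direction fixed for P consecutive worker updates, which is the persistence knob the experiments vary.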

Experiments on the ARC and ConceptARC benchmarks show that subgoal persistence, rather than subgoal injection alone, is the critical factor. Moderate persistence periods (P between 3 and 6) consistently yield better results than either per-step updates (P=1) or longer horizons. Specifically, P=3 achieved the lowest language-model loss of 1.544, compared with 1.674 for P=1 and 1.640 for the baseline. These findings were validated across five random seeds, yielding a mean loss of 1.595 with a standard deviation of 0.045. Additionally, the intrinsic alignment weight (lambda) exhibited a narrow optimal range around 0.05.
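To put the reported losses on a common scale, the short calculation below expresses the P=3 result as a relative reduction against the two comparison points. It uses only the three loss values quoted above; the helper name `relative_reduction` is hypothetical, and the calculation says nothing about statistical significance.

```python
def relative_reduction(reference, value):
    """Fractional reduction of `value` relative to `reference` (positive means lower loss)."""
    return (reference - value) / reference


# Loss values quoted in the abstract.
LOSS_P3 = 1.544
LOSS_P1 = 1.674
LOSS_BASELINE = 1.640

vs_baseline = relative_reduction(LOSS_BASELINE, LOSS_P3)
vs_p1 = relative_reduction(LOSS_P1, LOSS_P3)

print(f"P=3 vs baseline: {vs_baseline:.1%} lower loss")
print(f"P=3 vs P=1:      {vs_p1:.1%} lower loss")
```

Under these figures, P=3 lowers the loss by roughly 6% relative to the baseline and roughly 8% relative to P=1.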

Controlled ablation studies, which isolated variables at the optimal lambda value, demonstrate that the interference observed when the alignment signal becomes too strong stems from learned directional structures rather than mere architectural capacity or the presence of auxiliary losses. Collectively, these results suggest a fundamental design principle for compositional planning in latent reasoning: medium-horizon intent must remain coherent across a sufficient number of computational steps to allow compositional structures to develop.


Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC
