When to Re-Plan: Subgoal Persistence in Hierarchical Latent Reasoning
Title: Optimizing Re-Planning: The Role of Subgoal Persistence in Hierarchical Latent Reasoning
Abstract: Effective long-horizon reasoning demands a delicate balance: a system must adhere to medium-term objectives without becoming overly rigid. Frequent re-planning prevents the emergence of coherent multi-step structures, while excessive commitment causes plans to become obsolete. This paper investigates the trade-off between stability and adaptability within latent reasoning frameworks, where multi-step processing unfolds internally within hidden states rather than through explicit token sequences.
We enhance the Hierarchical Reasoning Model (HRM) by introducing a feudal manager-worker architecture. In this setup, a slower, high-level module issues a normalized directional subgoal that remains active for a specific duration of P low-level steps. This mechanism biases the worker’s hidden-state updates and introduces an intrinsic cosine alignment loss.
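The persistence mechanism described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`rollout`, `total_loss`), the additive form of the bias, the `bias_scale` parameter, and the choice of `1 - cosine` between the worker's state change and the subgoal as the alignment loss are all assumptions made for illustration. Only the general scheme (a normalized subgoal held for P low-level steps, a bias on the worker's hidden-state update, and a cosine alignment term weighted by lambda) comes from the abstract.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def rollout(h0, manager_fn, worker_fn, num_steps, period, bias_scale=0.1):
    """Run the worker for num_steps, refreshing the subgoal every `period` steps.

    manager_fn and worker_fn are hypothetical stand-ins for the high-level
    and low-level modules; each maps a hidden state (list of floats) to a
    vector of the same length.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    h = list(h0)
    subgoal = None
    refresh_steps = []
    align_losses = []
    for t in range(num_steps):
        # The manager issues a new normalized direction only every `period` steps.
        if t % period == 0:
            subgoal = normalize(manager_fn(h))
            refresh_steps.append(t)
        # Assumed additive bias: nudge the worker's update along the subgoal.
        h_next = [x + bias_scale * g for x, g in zip(worker_fn(h), subgoal)]
        # Assumed alignment loss: 1 - cos(state change, subgoal direction).
        delta = [a - b for a, b in zip(h_next, h)]
        align_losses.append(1.0 - cosine(delta, subgoal))
        h = h_next
    intrinsic = sum(align_losses) / len(align_losses) if align_losses else 0.0
    return h, intrinsic, refresh_steps


def total_loss(lm_loss, intrinsic, lam=0.05):
    """Combine the task loss with the alignment term, weighted by lambda."""
    return lm_loss + lam * intrinsic
```

In this sketch, `period=1` corresponds to re-planning at every low-level step, while larger values hold the same direction across several worker updates. The default `lam=0.05` simply mirrors the value the abstract reports as lying in the optimal range.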
Experiments on the ARC and ConceptARC benchmarks reveal that subgoal persistence is the critical factor, surpassing the impact of subgoal injection alone. Moderate persistence periods (P between 3 and 6) consistently yield superior results compared to either very frequent updates (P=1) or extended horizons. Specifically, a persistence of P=3 achieved the lowest language model loss of 1.544, significantly outperforming the P=1 result of 1.674 and the 1.640 baseline. These findings were validated across five random seeds, yielding a mean loss of 1.595 with a standard deviation of 0.045. Additionally, the intrinsic alignment weight (lambda) exhibited a narrow optimal range around 0.05.
Controlled ablation studies, which isolated variables at the optimal lambda value, demonstrate that the interference observed when the alignment signal becomes too strong stems from learned directional structures rather than mere architectural capacity or the presence of auxiliary losses. Collectively, these results suggest a fundamental design principle for compositional planning in latent reasoning: medium-horizon intent must remain coherent across a sufficient number of computational steps to allow compositional structures to develop.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



