arXiv

Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

Title: Trivium: Elevating Temporal Regret to a Primary Objective in Causal-Memory Control Systems

Abstract:

Current agentic systems and Large Language Model (LLM) pipelines typically correct errors by optimizing for outcome rewards. This approach captures only the "what" of a failure. When an outcome deviates from a prediction, the system does not systematically record, review, or correct the underlying causes and timing of the mismatch, so identical errors tend to repeat across episodes. We contend that this is a structural deficiency rather than a limitation of model capacity. To address it, we introduce long-horizon temporal regret as a primary objective, operating alongside outcome regret and epistemic regret within the framework of the working causal model.

Temporal regret quantifies how long a failure persists: specifically, how long a miscalibrated causal model is allowed to remain uncorrected. Epistemic regret addresses the root cause of that persistence by measuring the residual uncertainty or error in the active causal model. Outcome regret measures the failure itself. Together, these three forms of regret provide a falsifiable framework for understanding what, why, and when long-lived agents fail.
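As a rough illustration of the counting idea behind temporal regret, the sketch below tallies the episodes in which a working causal model is flagged as miscalibrated. The abstract does not give the formal definition, so the function names, the boolean per-episode flag, and the example trace are all hypothetical; this is one simple reading, not the paper's metric.

```python
from typing import Sequence


def temporal_regret(miscalibrated: Sequence[bool]) -> int:
    """Count the episodes in which the working causal model was miscalibrated.

    Assumption: one boolean flag per episode; the paper's actual
    definition may weight or normalize episodes differently.
    """
    return sum(1 for flag in miscalibrated if flag)


def longest_uncorrected_run(miscalibrated: Sequence[bool]) -> int:
    """Length of the longest stretch of consecutive miscalibrated episodes."""
    longest = current = 0
    for flag in miscalibrated:
        # Extend the current run on a miscalibrated episode, reset otherwise.
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest


# Hypothetical 8-episode trace: the model is wrong in episodes 2-4 and 7.
trace = [False, True, True, True, False, False, True, False]
total = temporal_regret(trace)
longest = longest_uncorrected_run(trace)
print(f"miscalibrated episodes: {total}, longest uncorrected run: {longest}")
```

Under this reading, an agent that corrects its model quickly keeps both numbers small, while an agent that never revisits its model accumulates one unit per episode.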

Modeling the agent's operation as a sequence of $E$ episodes, we derive three conditional results under explicit assumptions on causal probing, persistence, and detectability. First, under observationally equivalent confounding, outcome-only learning cannot distinguish causal from spurious structure without an intervention channel, so temporal miscalibration can keep accumulating linearly in $E$ even after outcome regret has been reduced to zero. Second, given a persistent causal log and budgeted probes, the total probe complexity is logarithmic in the episode horizon, which yields $O(\log E)$ temporal regret. Third, with $K$ detectable change-points, the rate becomes $O(K \log E)$.
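To give a sense of scale for the three growth rates, the sketch below evaluates a linear envelope, a logarithmic envelope, and a change-point envelope at the two horizons used in the pilot study. The hidden constants are not stated in the abstract, so they are set to 1 here, the natural logarithm is assumed, and the value of $K$ is a hypothetical choice; only the relative growth is meaningful.

```python
import math


def envelopes(E: int, K: int = 1) -> dict:
    """Return the three growth envelopes at horizon E, with unit constants.

    Assumptions: constants of 1 and natural log; the paper's bounds
    hide constants that this abstract does not report.
    """
    return {
        "linear": float(E),              # outcome-only worst case, grows with E
        "log": math.log(E),              # O(log E) with a causal log and probes
        "changepoint": K * math.log(E),  # O(K log E) with K change-points
    }


for E in (100, 500):
    env = envelopes(E, K=3)  # K = 3 is an illustrative value only
    print(
        f"E={E}: linear={env['linear']:.0f}, "
        f"log={env['log']:.2f}, K*log={env['changepoint']:.2f}"
    )
```

With unit constants, the logarithmic envelope at $E = 500$ is about 6.2 against 500 for the linear one, which is the gap the empirical comparison is designed to expose.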

We instantiate the Trivium framework and pre-register five falsifiable predictions. Empirical tests on CausalBench-Seq show that Trivium stays within the predicted logarithmic envelope, whereas outcome-only baselines exhibit linear growth. A pilot study on a real-LLM stream offers preliminary evidence of external validity, covering one complete run of $E = 500$ episodes and three pilot runs of $E = 100$ using frontier models. Self-learning in this context refers to the revision of an external causal model, not the retraining of LLM weights.


Source: arXiv. Generated at: 2026-06-04 00:00:00 UTC
