arXiv

Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning

Title: Ensuring Temporal Consistency in Episodic Memory for Cooperative Multi-Agent Reinforcement Learning

Abstract:

Cooperative Multi-Agent Reinforcement Learning (MARL) is often hindered by sparse rewards and limited exploration. Episodic memory strategies alleviate these problems by leveraging high-return trajectories, but they can also cause agents to settle into local optima, because unconstrained incentive distribution and semantic representation collapse undermine performance. To overcome these obstacles, we introduce Episodic Memory Temporal Consistency (EMTC), a framework for constructing and selectively using historical experiences.

EMTC comprises two complementary components. First, it employs a Temporally Consistent Semantic Embedder, which combines contrastive learning with time-conditioned state reconstruction. This approach prevents representation collapse and facilitates accurate memory retrieval. Second, the framework features a Temporal Consistency Gating Mechanism that dynamically adjusts episodic incentives according to temporal consistency errors. By filtering out misleading cues from trajectories that appear successful but are flawed, this adaptive gate effectively reduces Q-value overestimation.
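As an illustration of the second component only, the following is a minimal sketch of how an error-dependent gate could scale an episodic incentive before it is added to the environment reward. The abstract does not give the gate's functional form, so the exponential gate, the names `consistency_gate` and `gated_rewards`, and the parameters `beta` and `temperature` are all assumptions made for this example and should not be read as the authors' method.

```python
import math
from typing import List, Sequence


def consistency_gate(error: float, temperature: float = 1.0) -> float:
    """Map a non-negative temporal consistency error to a weight in (0, 1].

    Assumption: an exponential decay. The source only says the gate adapts
    to the temporal consistency error, not which function it uses.
    """
    if error < 0.0:
        raise ValueError("error must be non-negative")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    return math.exp(-error / temperature)


def gated_rewards(
    env_rewards: Sequence[float],
    episodic_incentives: Sequence[float],
    consistency_errors: Sequence[float],
    beta: float = 0.5,           # hypothetical weight on the episodic incentive
    temperature: float = 1.0,    # hypothetical sharpness of the gate
) -> List[float]:
    """Add the episodic incentive to each reward, scaled down by the gate."""
    if not (len(env_rewards) == len(episodic_incentives) == len(consistency_errors)):
        raise ValueError("all inputs must have the same length")
    return [
        r + beta * consistency_gate(e, temperature) * b
        for r, b, e in zip(env_rewards, episodic_incentives, consistency_errors)
    ]
```

Under these assumptions, a transition with zero consistency error receives the full incentive `beta * b`, while a transition with a large error receives almost none. This is one simple way to realize the filtering of misleading cues that the paragraph describes.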

We establish theoretical guarantees for the framework, deriving a strict error bound that connects observable temporal consistency errors to both the quality of representations and the optimality of the underlying trajectory. Comprehensive evaluations on the Google Research Football (GRF) and StarCraft Multi-Agent Challenge (SMAC) benchmarks show that EMTC consistently surpasses state-of-the-art baselines. Specifically, compared with the leading episodic baseline, EMTC improves win rates by up to 24% in super-hard SMAC scenarios and by 28% on average across GRF tasks.


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
