arXiv

TAP-JEPA: Frozen Future-Latent Probing and Two-Stage Score Fusion for EPIC-KITCHENS-100 Action Anticipation

Abstract

This paper introduces TAP-JEPA, which placed second in the EPIC-KITCHENS-100 (EK-100) Action Anticipation Challenge at EgoVis 2026. The challenge asks participants to predict the upcoming verb, noun, or combined verb-noun action from an egocentric video clip that ends before the target action begins. Rather than fine-tuning a large video backbone, TAP-JEPA builds a lightweight anticipation model on frozen V-JEPA 2.1 features. A ViT-G/384 encoder processes the visible pre-action tokens, and a pre-trained latent predictor infers near-future tokens from that context. Attentive probes with task-specific queries for verbs, nouns, and action pairs then combine the two token sets.
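To make the probing step concrete, the following is a minimal sketch of an attentive probe that pools frozen context tokens and predicted future tokens with one learned query per task. It is an illustration under stated assumptions, not the authors' implementation: the single-head attention, the token dimension, the class counts, the random weights, and every name in the code (`attentive_probe`, `w_k`, `w_v`, `heads`) are hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_probe(context_tokens, future_tokens, queries, w_k, w_v, heads):
    """Single-head cross-attention pooling with one query per task.

    context_tokens: (N_ctx, D) frozen encoder tokens (assumed shape)
    future_tokens:  (N_fut, D) frozen predictor tokens (assumed shape)
    queries:        dict task -> (D,) learned query vector
    heads:          dict task -> (D, C_task) linear classifier
    """
    # Both token sets are attended over jointly (an assumption of this sketch).
    tokens = np.concatenate([context_tokens, future_tokens], axis=0)  # (N, D)
    keys = tokens @ w_k
    values = tokens @ w_v
    scale = np.sqrt(keys.shape[-1])
    logits = {}
    for task, q in queries.items():
        attn = softmax(q @ keys.T / scale)   # (N,) attention over all tokens
        pooled = attn @ values               # (D,) task-specific summary
        logits[task] = pooled @ heads[task]  # (C_task,) class scores
    return logits

# Toy sizes only; the real token dimension and class counts are much larger.
rng = np.random.default_rng(0)
D = 16
N_CLASSES = {"verb": 5, "noun": 7, "action": 11}
context = rng.normal(size=(12, D))
future = rng.normal(size=(4, D))
queries = {t: rng.normal(size=D) for t in N_CLASSES}
w_k = rng.normal(size=(D, D)) / np.sqrt(D)
w_v = rng.normal(size=(D, D)) / np.sqrt(D)
heads = {t: rng.normal(size=(D, c)) for t, c in N_CLASSES.items()}

out = attentive_probe(context, future, queries, w_k, w_v, heads)
```

Because the backbone and predictor stay frozen, only the queries, projections, and classifier heads would be trained in a setup like this, which is what keeps the anticipation model lightweight.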

For the final entry, we expanded supervised training to the official training split plus most of the validation split, holding out only a small subset for sanity checks and qualitative assessment. We also applied a two-stage score fusion: first, we averaged the outputs of eight independently initialized probe replicas within each epoch; second, we combined the candidate outputs from epochs 12 through 20 using field-dependent weights. On the official open-testing leaderboard, our team (sunshinesky) achieved an overall action Mean Top-5 Recall (MT5R) of 27.91%, placing second and trailing the top score by 0.04 percentage points.
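The two-stage fusion can be sketched as follows. This is a hedged illustration, not the submitted pipeline: it assumes that "field" means the verb, noun, and action outputs, that replicas are averaged in probability space, and that the second stage is a normalized weighted sum. The weight values, class counts, and random logits are made up for the example.

```python
import numpy as np

EPOCHS = range(12, 21)  # epochs 12 through 20, as stated in the text
N_REPLICAS = 8          # eight probe replicas per epoch, as stated in the text

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_replicas(replica_logits):
    """Stage 1: average replica probabilities, (R, N, C) -> (N, C)."""
    return softmax(replica_logits, axis=-1).mean(axis=0)

def fuse_epochs(epoch_scores, epoch_weights):
    """Stage 2: normalized weighted sum over epochs.

    epoch_scores:  dict epoch -> (N, C) stage-1 scores
    epoch_weights: dict epoch -> float weight for this field
    """
    total = sum(epoch_weights[e] for e in epoch_scores)
    return sum(epoch_weights[e] * s for e, s in epoch_scores.items()) / total

rng = np.random.default_rng(0)
n_samples = 3
n_classes = {"verb": 6, "noun": 8, "action": 12}  # toy sizes, not EK-100 counts

# Hypothetical field-dependent weights; the actual values are not given.
field_weights = {
    "verb": {e: 1.0 for e in EPOCHS},
    "noun": {e: float(e - 11) for e in EPOCHS},
    "action": {e: 1.0 if e >= 16 else 0.5 for e in EPOCHS},
}

fused = {}
for field, c in n_classes.items():
    per_epoch = {}
    for e in EPOCHS:
        # Stand-in for the logits of eight replicas at epoch e.
        logits = rng.normal(size=(N_REPLICAS, n_samples, c))
        per_epoch[e] = fuse_replicas(logits)
    fused[field] = fuse_epochs(per_epoch, field_weights[field])

# Top-5 class indices per sample, the form needed for a top-5 recall metric.
top5 = {f: np.argsort(-s, axis=1)[:, :5] for f, s in fused.items()}
```

Normalizing by the weight sum keeps each fused row a valid probability distribution, so scores remain comparable across fields even when their weight schedules differ.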


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
