arXiv

Question-Aware Evidence Ledgers for Video Relational Reasoning

Title: Leveraging Question-Aware Evidence Ledgers for Video Relational Reasoning

Abstract:

The VRR-QA benchmark assesses visual relational reasoning in video. In this setting, the correct answer often depends on subtle cues such as implicit spatial relationships, event boundaries, the identity of specific targets, and conversational context, rather than on any single salient frame. To address this, we introduce a test-time reasoning framework built around a strong GPT-5.5 video question-answering solver, augmented with a set of question-aware evidence ledgers.

The solver first answers from a standardized video representation. Question-specific ledgers are then activated to pin down the elements that each reasoning task depends on, including counting, spatial analysis, endpoint detection, viewpoint assessment, and dialogue comprehension. These ledgers explicitly define targets, count units, reference frames, and temporal or spatial scopes.
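One way to picture a ledger is as a small record whose fields mirror the items listed above: target, count unit, reference frame, and temporal or spatial scope. The sketch below is illustrative only. The class name `EvidenceLedger`, its field names, and the per-task `REQUIRED_FIELDS` table are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping from task type to the fields that must be filled in
# before the ledger counts as fully specified. Assumed for illustration.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "counting": ["count_unit", "temporal_scope"],
    "spatial": ["reference_frame"],
    "endpoint": ["temporal_scope"],
    "viewpoint": ["reference_frame"],
    "dialogue": ["temporal_scope"],
}


@dataclass(frozen=True)
class EvidenceLedger:
    """Hypothetical question-aware ledger; not the authors' schema."""
    task: str                                   # e.g. "counting", "spatial"
    target: str                                 # what the question is about
    count_unit: Optional[str] = None            # e.g. "distinct objects"
    reference_frame: Optional[str] = None       # e.g. "camera" or "actor"
    temporal_scope: Optional[Tuple[float, float]] = None  # (start_s, end_s)
    spatial_scope: Optional[str] = None         # e.g. "on the table"

    def unresolved(self) -> List[str]:
        """Return the required fields that are still unspecified."""
        needed = REQUIRED_FIELDS.get(self.task, [])
        return [name for name in needed if getattr(self, name) is None]


ledger = EvidenceLedger(task="counting", target="red cup",
                        count_unit="distinct objects")
print(ledger.unresolved())
```

In this sketch a counting question with no time window reports `temporal_scope` as unresolved, which is the kind of ambiguity the ledgers are described as making explicit.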

We use external tools, such as open-vocabulary detection, depth information, pair crops, automatic speech recognition (ASR), and scene-graph ledgers, exclusively as sources of evidence. A conservative gate retains the solver’s original answer unless independent evidence clearly supports an alternative option. The final evidence-gated pipeline achieves an overall accuracy of 92.95% and a macro accuracy of 93.79% on the challenge’s test split.
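The conservative gating described above can be sketched as a simple decision rule. This is a minimal illustration under stated assumptions, not the authors' implementation. The `Evidence` record, the `gate_answer` function, and the `min_sources` and `min_confidence` thresholds are all hypothetical, and the abstract does not say how "independent" or "distinctly" is measured. Here, independence is approximated by requiring agreement from distinct tool sources.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Set


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence item produced by one external tool."""
    source: str        # e.g. "detector", "depth", "asr"
    supports: str      # answer option this item points to
    confidence: float  # assumed to lie in [0, 1]


def gate_answer(solver_answer: str,
                evidence: Sequence[Evidence],
                min_sources: int = 2,
                min_confidence: float = 0.8) -> str:
    """Keep the solver's answer unless exactly one alternative is backed
    by at least `min_sources` distinct, confident sources."""
    backing: Dict[str, Set[str]] = {}
    for item in evidence:
        if item.supports != solver_answer and item.confidence >= min_confidence:
            backing.setdefault(item.supports, set()).add(item.source)

    candidates = [option for option, sources in backing.items()
                  if len(sources) >= min_sources]
    # Override only when the evidence is unambiguous; otherwise stay put.
    return candidates[0] if len(candidates) == 1 else solver_answer


evidence = [Evidence("detector", "B", 0.90), Evidence("depth", "B", 0.85)]
print(gate_answer("A", evidence))
```

The default is deliberately asymmetric: a single tool, a low-confidence signal, or two alternatives with equal backing all leave the solver's answer unchanged.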


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
