arXiv

VCIFBench: Evaluating Complex Instruction Following for Video Understanding

June 4, 2026 · Huangchen Xu, Yuan Wu, Yi Chang


Abstract: Multimodal large language models (MLLMs) have advanced significantly in video comprehension, but current evaluation frameworks rely mostly on simple prompts and provide limited evidence of whether a model can meet specific output requirements. To address this gap, we present VCIFBench, a benchmark for complex instruction following in video understanding. VCIFBench builds constraint-rich instructions from two sources, adapted benchmark prompts and prompts grounded directly in video content, covering content, format, style, and structure requirements. Model responses are assessed with a hybrid verification approach. The benchmark comprises 306 satisfiable test instructions, a preference dataset of 540 pairs for Direct Preference Optimization (DPO), and a 30-item diagnostic subset aimed at identifying conflicts. Experiments on 10 MLLMs show that jointly satisfying multiple constraints remains difficult. We further show that DPO training on VCIFBench data improves instruction-following performance.

Source: arXiv

Generated at: 2026-06-04 00:00:00 UTC
