arXiv

FlowNar: Scalable Streaming Narration for Long-Form Videos

Abstract:

Although large multimodal models (LMMs) have advanced significantly, most are designed for offline use and are poorly suited to streaming video. Recent work has adapted these models for online, real-time processing, but such methods still scale poorly: resource consumption grows at least linearly with video duration. To address this bottleneck, we introduce FlowNar, a framework for scalable streaming video narration.

FlowNar is built on a dynamic context management strategy that discards historical visual context, paired with our novel CLAM (Cross Linear Attentive Memory) module, which retains the visual history during streaming. Together, they guarantee bounded visual memory usage and constant computational complexity, both essential for efficient streaming. We also propose a realistic self-conditioned evaluation protocol, along with complementary metrics, to rigorously assess streaming narration models under conditions that mirror real-world deployment.
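To make the bounded-memory claim concrete, here is a minimal sketch of a generic linear-attention memory whose state has a fixed size no matter how many frames are written to it. It is not the CLAM module itself, whose design the abstract does not specify. The class name `LinearAttentionMemory`, its methods, the `elu(x) + 1` feature map, and the toy dimensions are all assumptions chosen for illustration.

```python
import math


def phi(x):
    """Positive feature map elu(x) + 1, a common choice in linear attention."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


class LinearAttentionMemory:
    """Hypothetical fixed-size memory; illustrative only, not the paper's CLAM."""

    def __init__(self, d_key, d_value):
        # S accumulates phi(k) * v^T, z accumulates phi(k).
        self.S = [[0.0] * d_value for _ in range(d_key)]
        self.z = [0.0] * d_key

    def write(self, key, value):
        """Fold one frame feature into the state, then the frame can be dropped."""
        fk = phi(key)
        for i, ki in enumerate(fk):
            self.z[i] += ki
            row = self.S[i]
            for j, vj in enumerate(value):
                row[j] += ki * vj

    def read(self, query, eps=1e-6):
        """Return a normalized weighted average of all values written so far."""
        fq = phi(query)
        denom = sum(q * z for q, z in zip(fq, self.z)) + eps
        d_value = len(self.S[0])
        return [
            sum(fq[i] * self.S[i][j] for i in range(len(fq))) / denom
            for j in range(d_value)
        ]

    def num_floats(self):
        """Size of the state; it does not depend on how many frames were written."""
        return len(self.S) * len(self.S[0]) + len(self.z)


if __name__ == "__main__":
    memory = LinearAttentionMemory(d_key=4, d_value=3)
    for t in range(1000):  # simulate 1000 streamed frames with synthetic features
        key = [math.sin(0.1 * t + i) for i in range(4)]
        value = [math.cos(0.05 * t + j) for j in range(3)]
        memory.write(key, value)
    print(memory.num_floats())  # same state size as before any frame was written
    print(memory.read([0.3, -0.2, 0.5, 0.1]))
```

In this sketch, each `write` and `read` costs a number of operations proportional to `d_key * d_value`, independent of stream length. That is the property the paragraph above describes as bounded memory and constant computational complexity.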

Experiments on the Ego4D, EgoExo4D, and EpicKitchens100 datasets show that FlowNar significantly improves narration quality over strong baselines while remaining efficient: it can process videos ten times longer and achieves a threefold increase in throughput (FPS). The source code is available at https://github.com/zeyun-zhong/FlowNar.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
