arXiv

Scene-Centric Unsupervised Video Panoptic Segmentation

Abstract:

Video panoptic segmentation (VPS) seeks to simultaneously detect, segment, and track every object within a video while dividing the footage into semantically coherent regions. In this work, we establish the framework for unsupervised VPS, a setting that requires no human-generated labels. While research on unsupervised scene understanding has concentrated primarily on static image segmentation, the video domain has received little attention. To address this gap, we present VideoCUPS, the first method for unsupervised VPS. VideoCUPS creates temporally stable panoptic pseudo-labels by leveraging unsupervised signals such as depth, motion, and visual features derived from scene-centric videos. By training on these pseudo-labels with a newly designed Video DropLoss, we produce a highly accurate VPS model without supervision. To measure progress in this area, we develop a thorough evaluation protocol alongside four robust baseline models, adapting leading unsupervised panoptic image segmentation and video instance segmentation techniques to VPS. VideoCUPS surpasses all of these baselines and is highly label-efficient. Together with our evaluation protocol and baselines, VideoCUPS lays a solid groundwork for subsequent research on unsupervised video panoptic segmentation.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
