arXiv

Segment, Embed, and Align: A Universal Recipe for Aligning Subtitles to Signing

Abstract: This study aims to establish a general method for aligning subtitles (spoken-language text with timestamps) to continuous sign language video. Previous approaches have largely relied on end-to-end training tied to a specific language or dataset, which limits their applicability. To address this, we introduce Segment, Embed, and Align (SEA), a single framework that works across languages and domains. SEA uses two pre-trained models: the first segments a sequence of video frames into individual signs, and the second embeds the video clip of each sign into a latent space shared with text. Alignment is then performed by a lightweight dynamic programming procedure that runs on CPU and finishes within a minute, even for hour-long episodes. SEA is flexible and can adapt to a wide range of scenarios, using resources that range from small lexicons to large continuous corpora. Experiments on four sign language datasets show state-of-the-art alignment performance, highlighting SEA's potential to generate high-quality parallel data for advancing sign language processing. The code and models for SEA are publicly available.
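
The alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out SEA's actual dynamic program, so the code below is only one plausible formulation and not the authors' algorithm. It assumes a precomputed similarity matrix `sim`, where `sim[i][j]` scores sign segment `i` against subtitle `j` (for example, a cosine similarity in the shared embedding space), and it assumes that subtitles occur in the same order as the signing. The function name `align_monotonic` and the scoring scheme are hypothetical.

```python
from typing import List


def align_monotonic(sim: List[List[float]]) -> List[int]:
    """Assign each sign segment to one subtitle with a monotonic DP.

    sim[i][j] is an assumed similarity between sign segment i and
    subtitle j. The returned list gives one subtitle index per segment,
    non-decreasing over segments, maximizing the summed similarity.
    Runs in O(n * m) time for n segments and m subtitles.
    """
    n = len(sim)
    if n == 0:
        return []
    m = len(sim[0])
    neg_inf = float("-inf")

    # best[i][j]: best total score for segments 0..i with segment i on subtitle j.
    best = [[neg_inf] * m for _ in range(n)]
    # back[i][j]: subtitle chosen for segment i-1 on that best path.
    back = [[0] * m for _ in range(n)]
    best[0] = list(sim[0])

    for i in range(1, n):
        run_max, run_arg = neg_inf, 0
        for j in range(m):
            # Running maximum over best[i-1][0..j] enforces monotonicity.
            if best[i - 1][j] > run_max:
                run_max, run_arg = best[i - 1][j], j
            best[i][j] = sim[i][j] + run_max
            back[i][j] = run_arg

    # Backtrack from the best final subtitle.
    j = max(range(m), key=lambda c: best[n - 1][c])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy scores: 4 sign segments, 2 subtitles.
    toy = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.3, 0.6]]
    print(align_monotonic(toy))
```

Given the segment-to-subtitle assignment, a subtitle's new start and end times could be read off the first and last segments assigned to it. Because the table has one cell per segment and subtitle pair, a procedure of this shape stays cheap on CPU, which is consistent with the runtime the abstract reports.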


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
