arXiv

SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation

Title: SafeGen-Bench: Evaluating Safety Protocols in Image-Conditioned Text-to-Video Synthesis

Abstract:

As text-to-image diffusion technologies advance rapidly, generative video models such as Sora can now produce short synthetic clips from either text prompts or initial images. This capability, particularly when conditioned on an input image, raises significant safety concerns, including the generation of illegal, politically sensitive, or unethical material. Current benchmarks have begun to address video safety, but they focus predominantly on malicious text inputs and overlook the case where the combination of text and image prompts yields harmful outcomes. This case is both common and difficult: even when the text and the image each appear benign, the resulting video may still convey dangerous information.

To address this oversight, we present SafeGen-Bench, a specialized benchmark tailored for assessing the safety of conditional Text-to-Video (T2V) models. This framework identifies ten distinct categories of malicious content, with a specific emphasis on risks associated with temporal dynamics and portrayed actions. SafeGen-Bench utilizes a curated dataset of start frames sourced from various images and videos, matched with corresponding text prompts to replicate real-world usage scenarios.

Our evaluation of multiple conditional T2V models using SafeGen-Bench shows that existing systems cannot reliably prevent the generation of harmful material. Unsafety scores reached as high as 44.5, with performance notably degrading under quality-intensive conditions. We also tested the efficacy of both text-centric and image-centric guardrails. The findings indicate that unimodal safeguards alone are inadequate for robust protection, resulting in an 80% failure rate across seven of the ten defined malicious categories. We anticipate that SafeGen-Bench will contribute to the development of more secure and controllable conditional T2V technologies.


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
