arXiv

SafeGen-Bench: Benchmarking Safety in Image-Conditioned Text-to-Video Generation

June 2, 2026 · Yingzi Ma, Xiaogeng Liu, Yawen Zheng, Chaowei Xiao

Title: SafeGen-Bench: Evaluating Safety Protocols in Image-Conditioned Text-to-Video Synthesis

Abstract:

As text-to-image diffusion technologies advance rapidly, generative video models such as Sora can now produce short synthetic clips from either text prompts or initial images. This capability, particularly when conditioned on an input image, raises significant safety concerns, including the generation of illegal, politically sensitive, or unethical material. Existing video safety benchmarks focus predominantly on malicious text inputs and overlook the harder case in which a text prompt and an image jointly yield a harmful outcome. This case is both common and difficult: even when the text and the image each appear benign, the resulting video may still convey dangerous content.

To address this gap, we present SafeGen-Bench, a benchmark for assessing the safety of conditional Text-to-Video (T2V) models. It defines ten categories of malicious content, with particular emphasis on risks that arise from temporal dynamics and portrayed actions. SafeGen-Bench uses a curated dataset of start frames drawn from images and videos, each paired with a text prompt to mirror real-world usage.

Our evaluation of multiple conditional T2V models on SafeGen-Bench shows that existing systems cannot reliably prevent the generation of harmful material. Unsafety scores reach as high as 44.5, and safety degrades notably under quality-intensive conditions. We also test both text-centric and image-centric guardrails. The findings indicate that unimodal safeguards alone are inadequate for robust protection, with an 80% failure rate across seven of the ten malicious categories. We anticipate that SafeGen-Bench will contribute to the development of more secure and controllable conditional T2V technologies.
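The two quantities reported above, an unsafety score and a guardrail failure rate, can be illustrated with a minimal sketch. The abstract does not define either metric, so the definitions here are assumptions: the unsafety rate is taken to be the percentage of generated videos judged harmful, and the unimodal miss rate is taken to be the share of harmful videos whose inputs neither a text-only nor an image-only guardrail flagged. The `Result` record, the function names, the category labels, and the data are all hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Result:
    """One benchmark sample after generation (all fields are hypothetical)."""
    category: str        # placeholder label, not one of the paper's ten categories
    text_flagged: bool   # a text-only guardrail flagged the prompt
    image_flagged: bool  # an image-only guardrail flagged the start frame
    video_unsafe: bool   # a judge labeled the generated video as harmful


def unsafety_rate(results):
    """Percentage of samples whose generated video was judged unsafe."""
    if not results:
        return 0.0
    return 100.0 * sum(r.video_unsafe for r in results) / len(results)


def unimodal_miss_rate(results):
    """Per category: percentage of unsafe videos that neither unimodal guardrail flagged."""
    unsafe = defaultdict(int)
    missed = defaultdict(int)
    for r in results:
        if r.video_unsafe:
            unsafe[r.category] += 1
            if not (r.text_flagged or r.image_flagged):
                missed[r.category] += 1
    return {c: 100.0 * missed[c] / unsafe[c] for c in unsafe}


# Toy data, invented purely for illustration.
results = [
    Result("category_a", False, False, True),   # both inputs look benign, video is harmful
    Result("category_a", True, False, True),    # text guardrail flagged the prompt
    Result("category_a", False, False, False),  # benign inputs, benign video
    Result("category_b", False, False, True),   # both inputs look benign, video is harmful
]

print(unsafety_rate(results))
print(unimodal_miss_rate(results))
```

The first and last records capture the case the benchmark targets: each input passes its own guardrail, yet the generated video is harmful, so only a check that considers text and image together (or the output video) could catch it.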
