RealityTest: How People Probe AI Identity and Whether Models Disclose It
Title: RealityTest: Investigating AI Identity Probing and Disclosure Practices
Abstract
As artificial intelligence systems become more prevalent in conversational interfaces, users often struggle to determine whether they are interacting with a person or a machine. Although regulators are increasingly concerned about this ambiguity, current assessments of AI disclosure tend to be limited to English, rely on algorithmically generated prompts, and focus exclusively on text-based interactions. To address these gaps, we introduce RealityTest, a comprehensive benchmark designed to evaluate how well AI systems reveal their identity when queried. RealityTest is the first large-scale, multimodal, and multilingual assessment grounded in real-world data on how people actually encounter and question AI identity.
In conjunction with the benchmark, we are releasing a dataset comprising 3,152 identity-probing queries gathered from approximately 750 participants across 49 countries and five languages, covering both text and speech modalities. Our analysis reveals that only 31% of users directly ask about identity in ambiguous situations. Furthermore, the diversity of human-generated questions significantly exceeds that of machine-generated ones.
We evaluated 17 text-based and 6 speech-based models, observing significant differences in their disclosure behaviors. Notably, a single suppression instruction caused disclosure rates to drop below 30% even for the top-performing models. Our findings underscore the importance of diverse, human-grounded evaluation data, showing that the phrasing of the question and the conversational context play a more critical role in disclosure outcomes than the specific model being tested. Consequently, safety evaluations relying on narrow or synthetic query sets may fail to accurately reflect model behavior in realistic deployment scenarios.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





