Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features
Title: Leveraging Interpretable Linguistic Features for Robust Detection of AI-Generated Fake News Across Diverse Prompts
The proliferation of large language models has intensified anxieties regarding the dissemination of fabricated news created by AI, especially when different prompting techniques are employed. Current detection systems are typically developed and assessed within isolated generation environments, which obscures their capacity to perform effectively against novel, unseen prompts. To address this gap, our research examines how well detection models can generalize across different prompting strategies. We used three distinct datasets of AI-generated articles, each created under a different prompt, alongside authentic news reports.
We focused on extracting interpretable linguistic markers that reflect lexical variety, text readability, and emotional tone. These features were used to train a random forest classifier within a cross-prompt evaluation framework, where models trained on data from one prompt type were tested on data from another. Performance was consistently high across all six train-test combinations, with Area Under the Curve (AUC) scores ranging from 0.988 to 1.000.
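As a rough illustration of the cross-prompt protocol described above, the sketch below enumerates every ordered train/test pairing of three prompt types (six in total) and scores each pairing with a rank-based AUC. Everything concrete in it is an assumption made for illustration: the prompt names, the toy feature vectors, and the helper names `auc`, `fit_centroid`, and `cross_prompt_auc` are hypothetical, and a simple nearest-centroid scorer stands in for the random forest so that the example runs on the Python standard library alone.

```python
from itertools import permutations


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid(X, y):
    """Stand-in classifier (NOT the random forest from the text).

    Returns a scoring function where higher values mean "closer to the
    AI-class centroid than to the human-class centroid".
    """
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    ai_center = mean([x for x, label in zip(X, y) if label == 1])
    human_center = mean([x for x, label in zip(X, y) if label == 0])
    return lambda x: dist2(x, human_center) - dist2(x, ai_center)


def cross_prompt_auc(datasets, fit):
    """Train on one prompt's data, test on another's, for every ordered pair."""
    results = {}
    for train_name, test_name in permutations(datasets, 2):
        X_train, y_train = datasets[train_name]
        X_test, y_test = datasets[test_name]
        score = fit(X_train, y_train)
        results[(train_name, test_name)] = auc(y_test, [score(x) for x in X_test])
    return results


# Invented toy rows: (lexical diversity, readability, emotion ratio).
# Label 1 = AI-generated, 0 = authentic. These numbers are illustrative only.
datasets = {
    "prompt_a": ([(0.72, 40.0, 0.01), (0.70, 42.0, 0.02),
                  (0.50, 60.0, 0.06), (0.52, 62.0, 0.05)], [1, 1, 0, 0]),
    "prompt_b": ([(0.68, 45.0, 0.02), (0.66, 44.0, 0.01),
                  (0.51, 61.0, 0.05), (0.49, 59.0, 0.07)], [1, 1, 0, 0]),
    "prompt_c": ([(0.75, 38.0, 0.00), (0.74, 41.0, 0.02),
                  (0.53, 63.0, 0.06), (0.50, 58.0, 0.05)], [1, 1, 0, 0]),
}

results = cross_prompt_auc(datasets, fit_centroid)
for (train_name, test_name), value in sorted(results.items()):
    print(f"train={train_name} test={test_name} AUC={value:.3f}")
```

Any model that returns a per-article score can be passed as `fit`; with a random forest, the predicted probability of the AI class would presumably take the place of the centroid score used here.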
Further analysis of feature distributions revealed that AI-generated content tends to display higher lexical diversity and lower readability, along with significantly diminished emotional intensity relative to the broader dataset. While these characteristics varied depending on the specific prompt used, the classifier’s performance remained robust despite these distributional shifts. This stability suggests that the selected linguistic features capture inherent, consistent properties of AI-generated text that transcend specific prompting methods. Consequently, our findings indicate that approaches relying on these interpretable features offer a resilient solution for identifying AI-generated misinformation, even in the face of varying prompt inputs.
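To make the three feature families concrete, here is a minimal sketch of how such features might be computed and compared between AI-generated and authentic texts. The source does not state which exact measures were used, so the choices below are assumptions: type-token ratio for lexical diversity, the standard Flesch Reading Ease formula (with a crude vowel-group syllable heuristic) for readability, and the share of words found in a tiny hypothetical lexicon, `EMOTION_WORDS`, for emotional intensity. The function names and the two example sentences are likewise invented for illustration and are not evidence for the reported findings.

```python
import re

# Hypothetical mini-lexicon; a real study would rely on an established resource.
EMOTION_WORDS = {"shocking", "outrage", "terrifying", "furious", "devastating", "amazing"}


def tokenize(text):
    """Lowercase word tokens, keeping simple contractions such as "don't"."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def count_syllables(word):
    """Heuristic: each run of vowels counts as one syllable, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def linguistic_features(text):
    """Return the three illustrative features for a single text."""
    words = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    n_words = len(words)
    syllables = sum(count_syllables(w) for w in words)
    return {
        # Distinct words divided by total words.
        "type_token_ratio": len(set(words)) / n_words,
        # Flesch Reading Ease: higher values mean easier text.
        "flesch_reading_ease": 206.835
        - 1.015 * (n_words / len(sentences))
        - 84.6 * (syllables / n_words),
        # Share of words that appear in the emotion lexicon.
        "emotion_ratio": sum(w in EMOTION_WORDS for w in words) / n_words,
    }


def mean_feature_gap(ai_texts, human_texts):
    """Mean feature value over AI texts minus the mean over human texts."""
    def means(texts):
        rows = [linguistic_features(t) for t in texts]
        return {key: sum(r[key] for r in rows) / len(rows) for key in rows[0]}

    ai, human = means(ai_texts), means(human_texts)
    return {key: ai[key] - human[key] for key in ai}


# Invented example sentences, used only to exercise the functions.
ai_texts = ["The municipal administration subsequently announced "
            "comprehensive infrastructural modifications."]
human_texts = ["This is shocking. The cuts are shocking and people are furious."]

gap = mean_feature_gap(ai_texts, human_texts)
for name, value in gap.items():
    print(f"{name}: {value:+.3f}")
```

A positive gap means the feature is higher on average in the AI-generated texts. Running the same comparison separately for each prompt's dataset would be one way to inspect how the distributions shift from prompt to prompt.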
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC






