Label Over Logic? How Source Cues Bias Human Fallacy Judgments More Than LLMs
As AI-generated and AI-assisted material saturates the digital landscape, the labels attached to such content can skew how people assess its reasoning, with consequences for moderation, evaluation, and decision-making. It remains unclear whether Large Language Models (LLMs) are similarly vulnerable to these biases or whether they evaluate in a more source-agnostic way, a question directly relevant to human-AI collaboration.
To investigate this, the study uses logical fallacies as a controlled testbed, which separates the effect of source labels on judgments of reasoning quality from the confound of domain-specific knowledge. In an online experiment, 505 participants were distributed across five source conditions: human-authored, AI-authored, human-authored with AI assistance, AI-authored with human assistance, and no disclosure. Participants evaluated comments containing logical fallacies, and their judgments were compared with those of three LLMs (GPT-5.2, Gemini 2.5 Flash, and Claude Sonnet 4.5) that were given the same source labeling conditions.
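The labeling design for the LLM side can be illustrated with a minimal sketch. It assumes each comment is shown with a short source statement prepended; the label wording, the `build_stimuli` helper, and the example comment are hypothetical and not taken from the study, which names the five conditions but not the exact text shown to evaluators.

```python
from itertools import product

# Hypothetical label wording for the five source conditions named in the study.
# The exact phrasing used in the experiment is not given in the source.
SOURCE_LABELS = {
    "human": "This comment was written by a human.",
    "ai": "This comment was written by an AI.",
    "human_ai_assisted": "This comment was written by a human with AI assistance.",
    "ai_human_assisted": "This comment was written by an AI with human assistance.",
    "no_disclosure": None,  # no source statement is shown
}


def build_stimuli(comments):
    """Cross every comment with every source condition, holding the text fixed.

    Only the label varies within a comment, so any difference in judgments
    across conditions can be attributed to the label rather than the content.
    """
    stimuli = []
    for (comment_id, text), (condition, label) in product(
        comments.items(), SOURCE_LABELS.items()
    ):
        prompt = text if label is None else f"{label}\n\n{text}"
        stimuli.append(
            {"comment_id": comment_id, "condition": condition, "prompt": prompt}
        )
    return stimuli


# Illustrative comment (an appeal to popularity), not an item from the study.
comments = {"c1": "Everyone I know agrees with this, so it must be true."}
stimuli = build_stimuli(comments)
```

Each comment yields five prompts that differ only in the source statement. For human participants the study assigned people to conditions rather than showing every label to everyone, so this full crossing applies most directly to the LLM evaluations.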
Human evaluators were more likely to accept fallacies when the comments were labeled as written by a human or by a human with AI assistance, and in these conditions they also assigned higher trust and evaluation ratings. LLM evaluations, by contrast, remained relatively consistent across source labels, although performance differed among the models. Confidence was high for both humans and LLMs in all conditions, regardless of whether a fallacy was present.
These findings suggest that source-label bias in reasoning evaluation is primarily a human weakness. They also point to the potential value of collaborating with LLMs in environments increasingly mediated by AI, where human judgment might otherwise be compromised by labeling effects.
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC






