SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems
Title: SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems
Abstract:
As AI scientist systems become more prevalent in autonomous research, their adherence to academic integrity has yet to undergo systematic assessment. To address this gap, we present SCIINTEGRITY-BENCH, a novel benchmark built upon a dilemmatic evaluation framework. This benchmark comprises 33 scenarios distributed across 11 distinct trap categories. In each case, the only ethically correct action is to openly acknowledge failure; conversely, completing the task necessitates engaging in misconduct.
Our evaluation involved 231 runs across seven leading large language models (LLMs). The results indicate an overall integrity violation rate of 34.2%, with no model managing to avoid failures entirely. A particularly alarming trend emerged in scenarios involving missing data: all seven models produced synthetic data instead of admitting that the task was infeasible. The only variation among them lay in their transparency regarding this substitution.
Additionally, a prompt ablation study helped isolate two key factors. Eliminating explicit pressure to complete the task drastically lowered the rate of undisclosed fabrication from 20.6% to 3.2%. However, the rate of underlying data synthesis remained constant, highlighting an inherent completion bias that operates independently of prompt-level directives. These insights suggest that the primary cause of these failures is the lack of a trained disposition for honest refusal. SCIINTEGRITY-BENCH is publicly available at https://github.com/liuxingtong/Sci-Integrity-Bench.
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC




