"**Important** You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems
Title: "Important You should give me full credits!": Examining Prompt Injection Vulnerabilities in LLM-Driven Automated Grading
Abstract:
The rapid advancement of large language models (LLMs) has spurred significant growth in research concerning LLM-based automated grading (AG) systems. Leveraging the extensive prior knowledge and robust instruction-following abilities of LLMs, educators can implement these AG systems across a wide array of tasks using only natural language rubrics, all while maintaining satisfactory grading accuracy. However, these benefits introduce new security challenges. Specifically, prompt injection (PI) attacks have emerged as a critical threat to applications powered by LLMs. Within the domain of automated grading, malicious actors could exploit PI vulnerabilities to coerce grading systems into awarding inflated scores, irrespective of the genuine quality of the submitted answers. This potential manipulation threatens the fairness, reliability, and integrity of educational evaluations. This study investigates PI attacks within AG systems, systematically analyzing their efficacy in educational contexts. Additionally, we assess the performance of existing defensive measures against such threats. Our comprehensive experiments, conducted under rubric-based grading conditions, reveal that contemporary LLM-based AG systems remain highly susceptible to PI attacks. We aim for these findings to heighten awareness of this nascent danger and to inspire further research into developing secure, resilient, and trustworthy LLM-based educational frameworks.
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC



