Position: Adversarial ML for LLMs Is Not Making Any Progress
Title: Adversarial Machine Learning for Large Language Models Is Stagnating
Abstract: Over the last ten years, the machine learning community has invested significant effort in making models robust in adversarial environments. However, progress has been slow even on simple "toy" problems, such as robustness to small adversarial perturbations, and has frequently been hampered by evaluations that lack rigor. Recently, the focus of adversarial ML research has shifted toward large, general-purpose language models. This position paper argues that the situation is getting worse: in the age of LLMs, the field studies problems that are (1) poorly defined, (2) harder to solve, and (3) even harder to evaluate. Consequently, we warn that another ten years of work on adversarial ML risks yielding no substantial progress.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



