If LLMs Have Human-Like Attributes, Then So Does Age of Empires II
Title: If Large Language Models Have Human-Like Traits, Then So Does Age of Empires II
Original: arXiv:2605.31514v2 Announce Type: replace-cross Abstract: Much research has been carried out on large language models (LLMs) and LLM-powered agentic workflows. However, many works within the field state emergence of, ascribe to, or assume, generalised anthropomorphic attributes to them (e.g., morality or understanding of natural language). Our goal is not to argue in favour or against the existence of these attributes, but to point out that these conclusions could be incorrect. For this we build and train a simple neural network on the videogame Age of Empires II, and note that any entity in a sufficiently-powerful substrate, such as LEGO or the Greater Boston Area, could also present such attributes. Hence, the purported anthropomorphic attributes of LLMs are empirically non-unique: although some properties (e.g., responses to prompts) could remain constant, others, such as the interpretation of their perceived behaviour, might change with the substrate. Thus, any empirically-grounded discussion requires explicit measurement criteria; otherwise the interpretation is left to the representation. We then show that assuming that these attributes exist or not in a system, independent of the substrate and in a generalised way, leads to either circular or uninformative conclusions, regardless of the experimenter's viewpoint on the subject. Finally we propose a 'null' assumption, where one assumes LLM non-uniqueness instead of assuming anthropomorphic attributes to set up an experiment, along with examples of it. We also discuss potential objections to our work, briefly survey the field, and prove that Age of Empires II is functionally- and Turing-complete.
Rewritten: arXiv:2605.31514v2 Announce Type: replace-cross
Abstract: While significant effort has been dedicated to studying large language models (LLMs) and LLM-powered agentic workflows, numerous studies in the field claim the emergence of, ascribe, or assume generalised anthropomorphic attributes in these systems, such as morality or an understanding of natural language. Rather than arguing for or against the existence of these attributes, this paper points out that such conclusions could be incorrect. To demonstrate this, we built and trained a simple neural network on the video game Age of Empires II, and we note that any entity in a sufficiently powerful substrate, such as LEGO or the Greater Boston Area, could also present these attributes. Consequently, the purported anthropomorphic attributes of LLMs are empirically non-unique: some properties, such as responses to prompts, could remain constant, while others, such as the interpretation of perceived behaviour, might change with the substrate. Any empirically grounded discussion therefore requires explicit measurement criteria; without them, the interpretation is left to the representation. We further show that assuming these attributes do or do not exist in a system, independently of the substrate and in a generalised way, leads to either circular or uninformative conclusions, irrespective of the experimenter’s viewpoint. Finally, we propose a 'null' assumption, in which one assumes LLM non-uniqueness rather than anthropomorphic attributes when setting up an experiment, and we provide examples of it. The paper also discusses potential objections, briefly surveys the field, and proves that Age of Empires II is both functionally complete and Turing-complete.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



