Assessment of Generative Named Entity Recognition in the Era of Large Language Models
Title: Evaluating Generative Named Entity Recognition in the Age of Large Language Models
Abstract:
With the emergence of large language models (LLMs), named entity recognition (NER) is shifting from a conventional sequence labeling problem to a generative framework. This study presents a comprehensive assessment of open-source LLMs on both flat and nested NER tasks. We address several key questions: how large the performance gap is between generative NER and traditional approaches, how the output format affects results, to what extent LLMs depend on memorization, and whether fine-tuning compromises their general capabilities.
Testing eight LLMs of different sizes on four established NER benchmarks, we find the following: (1) With parameter-efficient fine-tuning and structured output formats (such as inline bracketed notation or XML), open-source LLMs match the performance of traditional encoder-based models and outperform decoder-based LLMs that rely on in-context learning. (2) The ability of LLMs to handle NER tasks is driven by their instruction-following and generative strengths rather than simple memorization of entity-label associations. (3) Instruction tuning for NER has a negligible negative effect on the general capabilities of LLMs; in fact, it can boost performance on specific datasets like DROP by 25.50 to 45.32 F1 points, likely due to improved entity comprehension.
Our results indicate that generative NER using LLMs offers a viable and accessible alternative to conventional methods. The associated code and data are available at https://github.com/szu-tera/LLMs4NER.
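As an illustration of the structured output formats mentioned in finding (1), the following is a minimal sketch of how XML-style generative output could be converted back into labeled spans, including nested entities. The tag syntax, the label names (ORG, LOC, PER), and the function name `parse_xml_entities` are assumptions made for this example; they are not taken from the paper or its repository.

```python
import re
from typing import List, Tuple

# Hypothetical tag syntax: <LABEL>...</LABEL>, with nesting allowed.
TAG = re.compile(r"<(/?)([A-Za-z_][\w-]*)>")

def parse_xml_entities(tagged: str) -> Tuple[str, List[Tuple[int, int, str]]]:
    """Strip tags and return (plain_text, [(start, end, label), ...]).

    Offsets index into the plain text; end is exclusive.
    """
    plain_parts: List[str] = []
    plain_len = 0
    stack: List[Tuple[str, int]] = []  # open tags: (label, start offset)
    spans: List[Tuple[int, int, str]] = []
    pos = 0
    for m in TAG.finditer(tagged):
        # Copy the untagged text that precedes this tag.
        chunk = tagged[pos:m.start()]
        plain_parts.append(chunk)
        plain_len += len(chunk)
        pos = m.end()
        closing, label = m.group(1), m.group(2)
        if not closing:
            stack.append((label, plain_len))
        elif stack and stack[-1][0] == label:
            _, start = stack.pop()
            spans.append((start, plain_len, label))
        # A closing tag that does not match the innermost open tag is skipped.
    plain_parts.append(tagged[pos:])
    # Tags still open at the end of the string produce no span.
    return "".join(plain_parts), sorted(spans)

text, spans = parse_xml_entities(
    "<ORG>University of <LOC>Shenzhen</LOC></ORG> hired <PER>Ada</PER>."
)
for start, end, label in spans:
    print(label, repr(text[start:end]))
```

A stack is used so that an inner entity (here the location inside the organization name) closes before its enclosing entity, which is what lets a single generated string carry nested annotations. Skipping malformed tags rather than raising is one possible design choice for noisy model output; a stricter evaluation might count such outputs as errors instead.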
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





