Automated Lexical Coverage for Language Learning: From General to Specialized Word Lists
Title: Automating Lexical Coverage in Language Education: Transitioning from Broad to Niche Vocabulary Lists
Abstract: The General Service List (GSL) serves as a standard reference for language students seeking to master essential English vocabulary. Historically, developing such lists has been laborious, depending heavily on linguistic specialists and subjective judgment. In this study, we developed a proprietary GSL and benchmarked its efficacy against the New General Service List (NGSL). Our analysis demonstrates that generating a Specialized Word List (SWL), one customized to a specific source text, offers a highly effective strategy for language acquisition. Because an SWL is extracted directly from the material being studied, it reaches the 95% lexical coverage necessary for comprehension with a significantly smaller vocabulary than a general-purpose list applied to the same content. In tests involving nine diverse texts, including academic articles, fiction, and screenplays, the NGSL achieved only 64–85% coverage. In contrast, text-specific lists attained the 95% threshold using far fewer words. Because the SWL development process relies only on objective metrics, the methodology can be fully automated, scaled, and customized to support language learners worldwide.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
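
The two quantities the abstract relies on, lexical coverage and a text-specific word list cut off at a coverage threshold, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (tokenize, coverage, specialized_word_list), the regex tokenizer, and the choice to count raw word forms rather than lemmas or word families are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens.

    Assumption: a simple regex tokenizer over raw word forms; the paper
    may instead group words into lemmas or word families.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage(tokens, word_list):
    """Return the fraction of running tokens that appear in word_list."""
    if not tokens:
        return 0.0
    known = set(word_list)
    return sum(1 for t in tokens if t in known) / len(tokens)


def specialized_word_list(tokens, target=0.95):
    """Build a word list from the text itself.

    Words are added in descending frequency order until the cumulative
    share of tokens they account for reaches `target`.
    """
    counts = Counter(tokens)
    total = len(tokens)
    selected, covered = [], 0
    for word, freq in counts.most_common():
        if covered >= target * total:
            break  # threshold already reached, stop adding words
        selected.append(word)
        covered += freq
    return selected


if __name__ == "__main__":
    sample = "The cat sat on the mat. The cat ran."
    toks = tokenize(sample)
    swl = specialized_word_list(toks, target=0.5)
    print(swl, round(coverage(toks, swl), 3))
```

Applying coverage to the same tokens with a general-purpose list (for example, the NGSL loaded from a file) in place of the text-specific list would give the kind of comparison the abstract reports, under the same tokenization assumption.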




