A Data-Driven Approach to Idiomaticity Based on Experts' Criteria in Theoretical Linguistics
Title: Leveraging Expert Criteria for a Data-Driven Analysis of Idiomaticity in Theoretical Linguistics
Abstract:
This study presents a data-driven examination of 286 multi-word expressions (MWEs), evaluating them against 16 specific lexical, grammatical, and other criteria drawn from established theoretical literature on idiomaticity. Both the MWEs and the defining criteria were sourced from the same theoretical texts. A panel of linguistics experts annotated the expressions according to these criteria. The resulting distribution of data indicates that no expression is entirely idiomatic. The findings suggest that lexical criteria exert the strongest influence, while grammatical criteria apply under specific conditions. Furthermore, the presence of archaic vocabulary and certain grammatical structures significantly affects an MWE’s capacity to be substituted by a single-word equivalent.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





