arXiv

CARTE: A Benchmark for Mapping Language Model Knowledge Across France

June 2, 2026 · Sarah Almeida Carneiro (X), Christos Xypolopoulos (X, NTUA), Xiao Fei (X), Yang Zhang (X), Michalis Vazirgiannis (X, MBZUAI)


Abstract:

We present CARTE (Culturally Anchored Regional-Territorial Evaluation), a multiple-choice benchmark that assesses how well large language models (LLMs) reason about geographically specific, regionally distinct knowledge within France. Existing benchmarks mostly target national-level cultural knowledge; they rarely probe variation within a country or require a model to tell closely situated regional contexts apart. CARTE addresses this gap with 2,431 questions covering the 13 metropolitan regions of France across 14 thematic areas, including economy, mobility, environment, demographics, language, and culture. We also introduce CARTE-LV, a subset devoted to Linguistic Variation across French regions, which enables a targeted assessment of language-related disparities. We evaluate 27 LLMs, ranging from 1B to 12B parameters, in few-shot settings. The results show significant performance gaps across regions and model scales, which points to systematic gaps in pretraining data coverage and limited robustness to intra-national variation.
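As a rough illustration of the kind of region-level comparison the abstract describes, the sketch below computes per-region accuracy on multiple-choice predictions and the spread between the best and worst region. It is not the authors' evaluation code: the record fields (`region`, `gold`, `pred`), the helper names, and the toy records are all assumptions made for this example, and the abstract does not specify how the reported gaps are actually measured.

```python
from collections import defaultdict


def accuracy_by_region(records):
    """Return {region: accuracy} for multiple-choice predictions.

    Each record is a dict with hypothetical keys:
    'region' (str), 'gold' (correct option), 'pred' (model's chosen option).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["region"]] += 1
        if record["pred"] == record["gold"]:
            correct[record["region"]] += 1
    return {region: correct[region] / total[region] for region in total}


def regional_gap(accuracies):
    """Difference between the best and worst per-region accuracy."""
    return max(accuracies.values()) - min(accuracies.values())


# Toy, made-up records for illustration only (not benchmark data).
toy_records = [
    {"region": "Bretagne", "gold": "A", "pred": "A"},
    {"region": "Bretagne", "gold": "C", "pred": "B"},
    {"region": "Occitanie", "gold": "D", "pred": "D"},
    {"region": "Occitanie", "gold": "B", "pred": "B"},
]

acc = accuracy_by_region(toy_records)
gap = regional_gap(acc)
```

Grouping by region before averaging keeps a region with few questions from being hidden inside a single overall score, which is the disparity a regional benchmark is meant to expose.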
