Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
As large language models (LLMs) are increasingly deployed worldwide, aligning their cultural value orientations has become essential for both safety and user engagement. However, current benchmarks struggle with the Construct-Composition-Context ($C^3$) challenge: they typically rely on discriminative, multiple-choice formats that assess value knowledge rather than genuine orientations (construct), ignore subcultural heterogeneity (composition), and do not match the open-ended generation that characterizes real-world use (context).
To address these limitations, we present DOVE, a distributional evaluation framework that directly compares the distributions of human-written text against LLM-generated outputs. DOVE employs a rate-distortion variational optimization objective to build a compact value codebook from 10,000 documents. This process maps text into a structured value space, effectively filtering out semantic noise. To measure alignment, the framework utilizes unbalanced optimal transport, which captures intra-cultural distributional structures and subgroup diversity.
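To make the unbalanced optimal transport step concrete, here is a minimal sketch of entropic unbalanced OT between two histograms over value codes, using the standard Sinkhorn-style scaling iterations with KL-relaxed marginals. It is an illustration only, not DOVE's implementation: the three-code codebook, the one-dimensional code positions, the squared-distance cost, the example histograms, and the parameters `eps` and `rho` are all assumptions made for this example.

```python
import numpy as np


def unbalanced_sinkhorn(a, b, cost, eps=0.5, rho=1.0, n_iters=200):
    """Entropic unbalanced OT plan with KL-relaxed marginals (generic sketch)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Gibbs kernel induced by the cost matrix and entropic strength eps.
    kernel = np.exp(-np.asarray(cost, dtype=float) / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    # theta < 1 softens the marginal constraints; as rho grows, theta
    # approaches 1 and the updates approach balanced Sinkhorn.
    theta = rho / (rho + eps)
    for _ in range(n_iters):
        u = (a / (kernel @ v)) ** theta
        v = (b / (kernel.T @ u)) ** theta
    # Transport plan: diag(u) @ kernel @ diag(v).
    return u[:, None] * kernel * v[None, :]


# Hypothetical toy codebook: three value codes placed on a line.
code_positions = np.array([0.0, 1.0, 2.0])
cost = (code_positions[:, None] - code_positions[None, :]) ** 2

# Hypothetical histograms over the value codes for one culture.
human = np.array([0.5, 0.3, 0.2])
model = np.array([0.1, 0.2, 0.7])

plan = unbalanced_sinkhorn(human, model, cost)
transport_cost = float((plan * cost).sum())

# Comparing a distribution with itself gives a symmetric plan.
plan_same = unbalanced_sinkhorn(human, human, cost)

print(plan.round(3))
print(transport_cost)
```

In this sketch, a smaller `rho` lets the plan leave mass unmatched instead of forcing distant value codes together, which is why the unbalanced variant tolerates subgroups that appear in one distribution but not the other.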
Experiments across 12 LLMs show that DOVE offers superior predictive validity, achieving a 31.56% correlation with downstream tasks, and that it remains highly reliable with as few as 500 samples per culture.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC