arXiv

EuraGovExam: A Multilingual Multimodal Benchmark from Real-World Civil Service Exams

Abstract: We introduce EuraGovExam, a benchmark derived from authentic civil service examinations across five Eurasian jurisdictions: the European Union, India, Japan, South Korea, and Taiwan. It captures the real complexity of public-sector testing, comprising more than 8,000 high-resolution scanned multiple-choice questions spanning 17 academic and administrative fields. Unlike conventional benchmarks, EuraGovExam embeds every question component (problem description, answer options, and visual cues) in a single image and supplies only a brief, standardized prompt about the answer format. This design requires models to perform layout-sensitive, cross-lingual reasoning directly from visual input. Because the dataset is sourced exclusively from actual examination papers, it retains complex visual features such as tables, multilingual typography, and form-based structures. Our evaluations show that even leading vision-language models (VLMs) reach only 86% accuracy, which highlights both the benchmark's difficulty and its usefulness in exposing current model shortcomings. By prioritizing cultural authenticity, visual complexity, and linguistic diversity, EuraGovExam sets a new standard for assessing VLMs in high-stakes, image-based, multilingual contexts, and it supports practical applications in e-governance, public document analysis, and fair exam preparation.


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
