Implicit Geographic Inference in LLM Medical Triage: Language-Driven Disparities in Emergency Recommendations
Title: Language-Induced Geographic Bias in LLM Emergency Triage: How Linguistic Cues Drive Disparate Medical Advice
Abstract
This study examines whether large language models (LLMs) produce different medical triage outcomes for the same clinical presentation depending solely on the language of the patient’s input. Using the Gemini 3.5 Flash model, we tested a standardized neurological symptom profile (persistent headaches, blurred vision, and nausea) in six languages: English, Spanish, Chinese, Hindi, Japanese, and Arabic. Each condition was run 30 times independently, for 450 API calls in total across the study.
Our analysis reveals marked disparities in how often the model recommended an emergency room (ER) visit. Severity scores were nearly uniform across all six languages, ranging from 7.7 to 8.0 out of 10, yet ER recommendation rates ranged from 0% for Japanese and Hindi prompts to 30% for English and Arabic prompts.
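The two outcomes reported above, a severity score and a yes/no ER recommendation, can be tallied per language in a few lines. The following is a minimal sketch, not the study's released code: the record fields `language`, `severity`, and `er_recommended` are hypothetical names, and the sample records are invented purely to show the calculation.

```python
from collections import defaultdict


def summarize_runs(runs):
    """Return {language: (mean_severity, er_rate_percent)} for a list of run records."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["language"]].append(run)

    summary = {}
    for language, records in grouped.items():
        n = len(records)
        mean_severity = sum(r["severity"] for r in records) / n
        # Share of runs in which the model recommended an ER visit, in percent.
        er_rate = 100.0 * sum(1 for r in records if r["er_recommended"]) / n
        summary[language] = (mean_severity, er_rate)
    return summary


# Invented records for illustration only; these are NOT values from the dataset.
example_runs = [
    {"language": "en", "severity": 8, "er_recommended": True},
    {"language": "en", "severity": 8, "er_recommended": False},
    {"language": "ja", "severity": 8, "er_recommended": False},
    {"language": "ja", "severity": 7, "er_recommended": False},
]

summary = summarize_runs(example_runs)
```

Keeping severity and ER rate side by side makes the pattern described in the abstract easy to spot: two languages can share almost the same severity score while differing in how often an ER visit is recommended.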
To investigate the source of these discrepancies, we introduced geographic anchors. Adding a single sentence stating that the patient was located in the US increased ER recommendations by up to 76.7 percentage points for non-English prompts. Conversely, anchoring an English prompt to a Tokyo location reduced the ER recommendation rate from 30% to 6.7%. A control experiment that back-translated the Japanese prompt into English yielded ER rates similar to the English baseline, confirming that the observed disparities stem from implicit geographic inference based on the input language rather than from deficiencies in translation quality. The full dataset, experimental code, and results are publicly available.
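Because each condition comprises 30 runs, every reported percentage corresponds to a whole number of runs, and one run is worth about 3.3 percentage points. The sketch below works through that arithmetic for the figures quoted in the abstract; the helper names are illustrative, and the run counts are derived from the stated rates rather than taken from the dataset.

```python
RUNS_PER_CONDITION = 30  # stated in the abstract


def rate_to_count(rate_percent, n=RUNS_PER_CONDITION):
    """Nearest whole number of runs implied by a reported percentage."""
    return round(rate_percent * n / 100)


def percentage_point_change(before_percent, after_percent):
    """Signed change between two rates, in percentage points."""
    return after_percent - before_percent


english_baseline_runs = rate_to_count(30.0)  # 30% of 30 runs
english_tokyo_runs = rate_to_count(6.7)      # 6.7% of 30 runs
max_us_anchor_shift = rate_to_count(76.7)    # 76.7 points expressed in runs
tokyo_shift_points = percentage_point_change(30.0, 6.7)
```

Under these assumptions the English baseline corresponds to 9 of 30 runs, the Tokyo-anchored English prompt to 2 of 30, and the largest US-anchor effect to a shift of 23 runs out of 30.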
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




