arXiv

Ekka: Automated Diagnosis of Silent Errors in LLM Inference

June 4, 2026 · Yile Gu, Zhen Zhang, Shaowei Zhu, Xinwei Fu, Jun Wu, Yida Wang, Baris Kasikci · Original Source

Title: Ekka: Automated Detection of Silent Errors in LLM Inference

Abstract:

Large Language Model (LLM) serving frameworks are undergoing rapid evolution, characterized by increasingly complex software stacks and a wide array of optimization techniques. This accelerated development cycle often introduces silent errors—issues where output quality degrades without triggering any explicit error signals. Diagnosing these problems is notoriously challenging due to the significant semantic gap that exists between high-level symptoms and their low-level root causes.

We demonstrate that diagnosing silent errors can be effectively addressed as a differential debugging problem by utilizing semantically correct reference implementations. To this end, we introduce Ekka, an automated diagnosis system designed to pinpoint root causes by systematically aligning and comparing intermediate execution states between a target framework and a reference one.

We developed a benchmark comprising real-world silent errors found in popular serving frameworks. On this benchmark, Ekka achieved a pass@1 diagnosis accuracy of 80% and a pass@5 accuracy of 88%, surpassing state-of-the-art systems. Furthermore, Ekka successfully identified four previously unknown silent errors in serving frameworks, all of which were subsequently confirmed by the respective developers.

Source: arXiv Generated at: 2026-06-04 00:00:00 UTC