arXiv

Revisiting Vul-RAG: Reproducibility and Replicability of RAG-based Vulnerability Detection with Open-Weight Models

June 4, 2026 · Sabrina Kaniewski, Fabian Schmidt, Tobias Heer

Abstract:

Large language models (LLMs) show promise for automating software vulnerability detection, especially within retrieval-augmented generation (RAG) frameworks. However, the reproducibility and replicability of methods that depend on proprietary models and APIs have received little attention. This gap leaves open whether previously reported results generalize or are artifacts of specific model choices. To address this, we conduct a reproducibility study of Vul-RAG, a RAG-based framework that augments LLMs with high-level vulnerability knowledge for source code analysis.

Our study first reproduces Vul-RAG’s original results using the reported open-weight baseline models in a fully local environment. We then extend the evaluation to a range of recent open-weight LLMs, including code-specialized, general-purpose, and reasoning models of various parameter sizes. Our findings confirm that Vul-RAG’s results are reproducible in local deployments, albeit with slight deviations.

Across all tested models, performance plateaus at a pairwise accuracy of approximately 0.30, where a code pair counts as correct only if both its vulnerable function and its patched function are classified correctly. This plateau persists even with newer, more capable models, suggesting that increasing model capacity alone yields limited gains in detection effectiveness. We conclude by discussing the practical implications and trade-offs among detection effectiveness, model capabilities, and model scale. All implementation and evaluation artifacts are publicly available at https://github.com/hs-esslingen-it-security/revisiting-Vul-RAG.
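
The pairwise accuracy metric mentioned above can be illustrated with a minimal sketch. The function name `pairwise_accuracy`, the tuple layout, and the toy predictions are assumptions made for illustration only; they are not taken from the Vul-RAG code base or from the study's evaluation scripts.

```python
from typing import Iterable, Tuple


def pairwise_accuracy(predictions: Iterable[Tuple[bool, bool]]) -> float:
    """Fraction of code pairs in which both functions are classified correctly.

    Each item is a hypothetical (vulnerable_flagged, patched_flagged) tuple:
    whether the detector flagged the vulnerable function, and whether it
    flagged the patched function, as vulnerable.
    """
    pairs = list(predictions)
    if not pairs:
        raise ValueError("need at least one code pair")
    # A pair is correct only if the vulnerable function is flagged
    # AND the patched function is not flagged.
    correct = sum(
        1
        for vulnerable_flagged, patched_flagged in pairs
        if vulnerable_flagged and not patched_flagged
    )
    return correct / len(pairs)


# Toy predictions (made up for illustration, not study data).
example = [
    (True, False),   # both correct
    (True, True),    # patched function wrongly flagged
    (False, False),  # vulnerable function missed
    (True, False),   # both correct
    (False, True),   # both wrong
]
score = pairwise_accuracy(example)
print(score)  # 0.4 (2 of 5 pairs fully correct)
```

Under this definition, a detector that flags every function, or none, scores 0 even though it is right on half of the individual functions. This is why pairwise accuracy is stricter than per-function accuracy.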
