arXiv

Revisiting Vul-RAG: Reproducibility and Replicability of RAG-based Vulnerability Detection with Open-Weight Models

Title: Re-examining Vul-RAG: Assessing the Reproducibility and Replicability of Open-Weight Models in RAG-Based Vulnerability Detection

Abstract:

While large language models (LLMs) demonstrate significant promise for automating software vulnerability detection, especially within retrieval-augmented generation (RAG) frameworks, the reproducibility and replicability of methods that depend on proprietary models and APIs have received little attention. This gap raises the question of whether previously reported results generalize or are merely artifacts of specific model choices. To address this, we conduct a reproducibility study of Vul-RAG, a RAG-based framework designed to augment LLMs with high-level vulnerability knowledge for source code analysis.

Our study first reproduces Vul-RAG’s original results using the reported open-weight baseline models in a fully local environment. We then broaden the scope by evaluating a diverse set of recent open-weight LLMs, covering code-specialized, general-purpose, and reasoning models across various parameter sizes. Our findings confirm that Vul-RAG’s results are reproducible in local deployments, albeit with slight deviations.

Across all tested models, performance plateaus at a pairwise accuracy of approximately 0.30, where a code pair counts as correct only if both its vulnerable function and its patched function are classified correctly. Importantly, this ceiling persists even with newer, more advanced models, suggesting that increasing model capacity alone yields limited gains in detection effectiveness. The paper concludes by discussing the practical implications and trade-offs involved in balancing detection efficacy, model capabilities, and model scale. All implementation and evaluation artifacts are publicly accessible at https://github.com/hs-esslingen-it-security/revisiting-Vul-RAG.
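
To make the pairwise accuracy metric concrete, the following is a minimal sketch of how it could be computed from the definition given above. It is not taken from the paper's artifacts: the function name `pairwise_accuracy`, the tuple layout, and the example predictions are all hypothetical and chosen purely for illustration.

```python
from typing import Sequence, Tuple


def pairwise_accuracy(predictions: Sequence[Tuple[bool, bool]]) -> float:
    """Fraction of code pairs in which BOTH functions are classified correctly.

    Each item is (vulnerable_flagged, patched_flagged): whether the detector
    reported the vulnerable function, and the patched function, as vulnerable.
    This layout is an assumption made for this sketch.
    """
    if not predictions:
        raise ValueError("at least one code pair is required")
    # A pair is correct only if the vulnerable function is flagged
    # and the patched function is not flagged.
    correct = sum(
        1
        for vulnerable_flagged, patched_flagged in predictions
        if vulnerable_flagged and not patched_flagged
    )
    return correct / len(predictions)


# Made-up predictions for four code pairs (not data from the study).
example_pairs = [
    (True, False),   # both correct: counts toward pairwise accuracy
    (True, True),    # patched function wrongly flagged
    (False, False),  # vulnerable function missed
    (False, True),   # both wrong
]
print(pairwise_accuracy(example_pairs))  # 0.25
```

Under this definition, a detector that labels every function as vulnerable, or every function as benign, scores 0 even though its per-function accuracy on balanced pairs would be 0.5, which is why the metric is stricter than ordinary accuracy.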


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
