arXiv

Towards Lightweight Reliability: Using Soft Prompts for Hallucination Mitigation in Large Language Models

June 2, 2026 · S M Tahmid Siddiqui, Akib Jawad Ononto, Anoop Singhal, Latifur Khan

Abstract

While Large Language Models (LLMs) are increasingly integrated into diverse fields, their practical utility is often compromised by hallucinations: outputs that sound convincing but are factually inaccurate. In critical sectors, such inaccuracies can erode user confidence and pose significant real-world dangers. To tackle this issue, we introduce a parameter-efficient strategy that uses soft prompts to reduce hallucinated responses and encourage responsible abstention in generative Question-Answering (QA) tasks.

We propose a method named Responsible Contrastive Soft Prompting (RCSP), which employs a composite loss function to train soft prompts. This approach simultaneously targets three objectives: curbing hallucinatory content, fostering abstention when faced with uncertainty, and maintaining or enhancing factual recall. The training mechanism integrates contrastive loss, curriculum learning, and KL regularization to achieve these aims.
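The abstract names the ingredients of the composite loss but not their exact form, so the following is only a minimal sketch of how such terms could be combined. Everything in it is an assumption made for illustration: the hinge-style margin form of the contrastive term, the direction of the KL term (prompted model against the frozen base model), the easiest-first pacing used for the curriculum, and all function names, weights, and the `difficulty` field. It is not the authors' implementation.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def contrastive_margin_loss(score_preferred, score_hallucinated, margin=1.0):
    """Assumed hinge form: zero once the preferred response (a factual answer
    or an abstention) outscores the hallucinated one by at least `margin`."""
    return max(0.0, margin - (score_preferred - score_hallucinated))

def curriculum_subset(examples, step, total_steps, start_fraction=0.25):
    """Assumed pacing: train on the easiest examples first and grow the
    usable fraction linearly until the full set is available."""
    progress = min(1.0, step / total_steps)
    fraction = start_fraction + (1.0 - start_fraction) * progress
    count = max(1, round(fraction * len(examples)))
    ranked = sorted(examples, key=lambda ex: ex["difficulty"])
    return ranked[:count]

def composite_loss(nll, score_preferred, score_hallucinated,
                   prompted_logits, base_logits,
                   contrastive_weight=1.0, kl_weight=0.1):
    """Weighted sum of a generation loss, a contrastive term, and a KL
    penalty that keeps the prompted model close to the frozen base model."""
    kl = kl_divergence(softmax(prompted_logits), softmax(base_logits))
    contrast = contrastive_margin_loss(score_preferred, score_hallucinated)
    return nll + contrastive_weight * contrast + kl_weight * kl
```

In a real soft-prompt setup, only the prompt embeddings would receive gradients from this loss while the backbone stays frozen; the scalar inputs here stand in for quantities a training loop would compute from model outputs.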

We evaluated our method on five generative QA datasets using an LLM-as-a-Judge framework. Experiments on Gemma 3 (12B) and Llama 3.1 (8B) backbones indicate that RCSP balances factual recall against hallucination suppression and appropriate abstention. As a result, it achieves a generally higher F-score than baselines that rely on reasoning-based and instruction-based prompting. Importantly, these gains come from tuning only a small fraction of the parameters required by other adaptation techniques. Our findings suggest that soft prompts offer a modular and computationally efficient route to more reliable LLMs.
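The abstract does not define how the F-score is computed. As a rough illustration of how judge verdicts could be aggregated so that abstention and hallucination both affect the score, the sketch below assumes each response has been labeled `correct`, `hallucinated`, or `abstained` (hypothetical label names), takes precision over answered questions only, and takes recall over all questions. The paper's actual metric may differ.

```python
from collections import Counter

LABELS = ("correct", "hallucinated", "abstained")

def reliability_scores(judge_labels):
    """Aggregate per-question judge verdicts into summary rates.

    Assumed definitions (not taken from the paper):
      precision = correct / (correct + hallucinated)   # answered questions only
      recall    = correct / all questions              # abstaining costs recall
      f_score   = harmonic mean of precision and recall
    """
    counts = Counter(judge_labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    total = len(judge_labels)
    correct = counts["correct"]
    answered = correct + counts["hallucinated"]

    precision = correct / answered if answered else 0.0
    recall = correct / total if total else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "hallucination_rate": counts["hallucinated"] / total if total else 0.0,
        "abstention_rate": counts["abstained"] / total if total else 0.0,
    }

# Example with made-up verdicts: 6 correct, 2 hallucinated, 2 abstained.
scores = reliability_scores(
    ["correct"] * 6 + ["hallucinated"] * 2 + ["abstained"] * 2
)
```

Under these assumed definitions, a model that abstains instead of hallucinating raises precision without changing recall, which is one way a single score can reward appropriate abstention.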
