arXiv

Re-Ranking Through an Attribution Lens for Citation Quality in Legal QA

June 3, 2026 · Mohamed Hesham Elganayni, Selim Saleh

Title: Enhancing Legal QA Citation Quality via Attribution-Based Re-Ranking

Abstract

Retrieval-augmented legal question-answering systems typically retrieve passages by semantic similarity and pass them to a language model, which generates an answer with citations. Prior work generally assumes that the top-ranked passages are the most likely to be useful as citations. Perturbation-based attribution techniques such as C-LIME have so far been used only for post-hoc interpretability. Our analysis of the AQuAECHR benchmark shows that semantic similarity and passage attribution diverge: within a retriever’s candidate pool, ranking by similarity performs worse than random selection at surfacing the gold citation paragraphs.
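The idea of perturbation-based passage attribution can be illustrated with a minimal sketch. It uses plain leave-one-out ablation as a simpler stand-in for C-LIME, which is not reproduced here. The `toy_answer_score` function and the placeholder passages are hypothetical; in a real setting the score would come from the language model, for example the likelihood of the generated answer given the context.

```python
from typing import Callable, List, Sequence


def leave_one_out_attribution(
    passages: Sequence[str],
    answer_score: Callable[[Sequence[str]], float],
) -> List[float]:
    """Attribute to each passage the drop in answer score when it is removed."""
    full_score = answer_score(passages)
    attributions = []
    for i in range(len(passages)):
        # Perturb the context by deleting exactly one passage.
        ablated = [p for j, p in enumerate(passages) if j != i]
        attributions.append(full_score - answer_score(ablated))
    return attributions


# Hypothetical stand-in for a model-based answer score: it simply counts
# passages that contain the phrase the answer is assumed to depend on.
def toy_answer_score(context: Sequence[str]) -> float:
    return float(sum(1 for p in context if "key holding" in p))


# Placeholder passages, invented for illustration only.
passages = [
    "Background on the applicant's complaint.",
    "Paragraph stating the key holding the answer relies on.",
    "Procedural history of the case.",
]

attributions = leave_one_out_attribution(passages, toy_answer_score)
print(attributions)  # [0.0, 1.0, 0.0]
```

Only the second passage changes the score when removed, so it receives all of the attribution, regardless of where a similarity-based retriever would have ranked it.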

To close this gap, we train a lightweight cross-encoder on continuous, perturbation-derived attribution scores and use it to re-rank passages before generation. We evaluate the approach on the AQuAECHR benchmark with two language models and five-fold cross-validation. Re-ranking significantly improves both the faithfulness of citations and their alignment with expert-provided gold answers. Notably, re-rankers trained independently on attributions from the two models agree with each other more closely than the raw attribution scores they were trained on. This suggests that the cross-encoder filters out model-specific noise and learns a shared relevance signal that transfers partially across models, although same-model re-ranking remains more effective. These findings show that perturbation-based attribution can serve as a practical, model-agnostic training signal for citation-aware retrieval.
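The data flow of attribution-based re-ranking might look roughly like the sketch below. It is not the authors' implementation: the function names are hypothetical, and the trained cross-encoder is replaced by a placeholder `scorer` callable that simply looks up the attribution targets it was given. The sketch only shows how continuous attribution scores become regression targets and how a scorer reorders candidates before generation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (question, passage, continuous attribution target)
Example = Tuple[str, str, float]


def build_training_examples(
    question: str, passages: Sequence[str], attributions: Sequence[float]
) -> List[Example]:
    """Pair each candidate passage with its attribution score as a regression target."""
    if len(passages) != len(attributions):
        raise ValueError("passages and attributions must have the same length")
    return [(question, p, float(a)) for p, a in zip(passages, attributions)]


def rerank(
    question: str,
    passages: Sequence[str],
    scorer: Callable[[str, str], float],
    top_k: int,
) -> List[str]:
    """Order passages by predicted attribution (highest first) and keep top_k."""
    scored = [(scorer(question, p), i) for i, p in enumerate(passages)]
    # Ties keep the retriever's original order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [passages[i] for _, i in scored[:top_k]]


# Illustrative values only; real targets would come from perturbation scores.
question = "placeholder question"
candidates = ["passage A", "passage B", "passage C"]
targets = [0.1, 0.7, 0.4]

examples = build_training_examples(question, candidates, targets)

# Placeholder for a trained cross-encoder: memorizes the training targets.
lookup: Dict[Tuple[str, str], float] = {(q, p): a for q, p, a in examples}


def scorer(q: str, p: str) -> float:
    return lookup.get((q, p), 0.0)


top_passages = rerank(question, candidates, scorer, top_k=2)
print(top_passages)  # ['passage B', 'passage C']
```

In practice the lookup would be replaced by a model that generalizes to unseen question-passage pairs, and the re-ranked passages would then be passed to the generator.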
