Dual-Route Top-K Retrieval with 1v1 VLM Reranking for the CoVR-R
Abstract:
This paper describes our approach to the CoVR-R challenge, Dual-Route Top-K Retrieval with 1v1 VLM Reranking. We treat composed video retrieval as two coupled problems: building a high-recall top-k candidate pool, and deciding conservatively whether any candidate should displace the current top-ranked result. We first improve the reasoning and text seed by applying a VLM slot selector to the existing candidates, without using DFN visual retrieval at this stage. We then add a visual route that uses DFN-H and DFN-L contact-sheet embeddings. The two routes are merged into a top-10 candidate list, and a VLM-based final reranker performs conservative one-on-one comparisons between the current top candidate and each challenger. On the hidden test split, the final system reaches 95.28 R@1, 97.47 R@5, 98.48 R@10, and 99.66 R@50. The main lesson is that CoVR-R benefits more from decoupling recall from selection than from heavier text reranking or direct multi-candidate VLM classification.
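The split between recall and selection can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the two routes are merged, so reciprocal rank fusion is used here as an assumed stand-in. The names `fuse_routes`, `conservative_rerank`, `prefer_challenger`, and the `margin` threshold are all hypothetical, and `prefer_challenger` stands in for a VLM call that compares the current top candidate with a single challenger.

```python
from typing import Callable, Dict, List, Sequence


def fuse_routes(text_ranked: Sequence[str], visual_ranked: Sequence[str],
                k: int = 10, rrf_c: int = 60) -> List[str]:
    """Merge two ranked candidate lists into one top-k pool.

    Reciprocal rank fusion is an assumption made for illustration; the
    source does not specify the fusion rule.
    """
    scores: Dict[str, float] = {}
    for ranked in (text_ranked, visual_ranked):
        for rank, cand in enumerate(ranked):
            scores[cand] = scores.get(cand, 0.0) + 1.0 / (rrf_c + rank + 1)
    # Highest fused score first; ties broken by candidate id for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


def conservative_rerank(pool: List[str],
                        prefer_challenger: Callable[[str, str], float],
                        margin: float = 0.8) -> List[str]:
    """Compare the current top-1 with each challenger, one pair at a time.

    `prefer_challenger(incumbent, challenger)` is a hypothetical hook that
    returns a confidence in [0, 1] that the challenger is the better match.
    The incumbent keeps its place unless some challenger reaches `margin`;
    if several do, the most confident one is promoted.
    """
    if not pool:
        return []
    incumbent = pool[0]
    winner, best_conf = incumbent, margin
    for challenger in pool[1:]:
        conf = prefer_challenger(incumbent, challenger)
        if conf >= best_conf:
            winner, best_conf = challenger, conf
    # Move the winner to the front and keep the remaining order unchanged.
    return [winner] + [c for c in pool if c != winner]
```

With a stub judge in place of the VLM, `conservative_rerank(fuse_routes(text_ids, visual_ids), judge)` returns the pool with at most one candidate promoted to rank 1. Raising `margin` makes the reranker more reluctant to displace the incumbent, which is the conservative behavior the abstract describes.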
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





