arXiv

RoboBenchMart: Benchmarking Robots in Retail Environment

June 2, 2026 · Konstantin Soshin, Alexander Krapukhin, Andrei Spiridonov, Gregorii Bukhtuev, Andrey Kuznetsov, Vlad Shakhuro, Denis Shepelev

Title: RoboBenchMart: Benchmarking Robots in Retail Environment

Abstract:

Current robotic manipulation benchmarks predominantly concentrate on tabletop or domestic contexts. Although these environments have spurred remarkable advancements, it remains uncertain whether generalist Vision-Language-Action (VLA) models that perform well in such settings can effectively transfer to domains characterized by distinct geometries, semantic meanings, and operational workflows. To address this gap, we present RoboBenchMart, an open-source simulated benchmark designed for retail "dark-store" environments. In this scenario, a mobile manipulator is tasked with executing complex manipulation operations involving a wide variety of grocery products.

This setting introduces substantial difficulties, such as dense object clutter and varied spatial arrangements, in which items sit at varying heights and depths and in close proximity to one another. By focusing on the retail sector, our benchmark targets a domain with high potential for immediate automation applications. Using generated trajectories, we establish a standard, realistic fine-tuning protocol for contemporary generalist VLAs and assess the performance of several leading models. Our results reveal that these models struggle even with standard retail tasks, suggesting that they have not yet achieved true generalization across domains. To facilitate further research, we are releasing the complete RoboBenchMart suite, which comprises a procedural store layout generator, a trajectory generation pipeline, evaluation tools, and fine-tuned baseline models.
