RobotValues: Evaluating Household Robots When Human Values Conflict
Title: RobotValues: Assessing Household Robots Amidst Conflicting Human Values
Abstract:
Although household robots are typically assessed by their ability to complete tasks, real-world domestic settings frequently present situations where multiple human values compete. In these contexts, robots are expected to make decisions that prioritize principles such as social appropriateness, efficiency, or human autonomy over mere task success. However, current research lacks benchmarks to evaluate how robots navigate these value-based dilemmas. To address this gap, we present RobotValues, a new benchmark designed to test household robot planners across 10,000 scenarios involving conflicting values. Each scenario features a realistic household image alongside several plausible robotic actions, each aligned with different human values.
We developed RobotValues using a pipeline that includes LLM-assisted scenario creation, value extraction grounded in stakeholder perspectives, image synthesis, and automated quality assurance. Our evaluation of Vision-Language Models (VLMs) employed in robotics reveals that these models possess inherent value biases, favoring safety and accommodation while neglecting privacy-preserving choices. Furthermore, when prompted to prioritize specific values that clash with their default inclinations, the models largely failed to adjust their behavior. In 80% of these instances, they selected incorrect actions, unable to override their initial preferences. These results indicate that evaluating household robots requires more than just measuring task completion or safety adherence; it must also assess the robot’s capacity to make nuanced decisions when human values are in conflict.
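As a rough illustration of the two measurements described above (default value preferences, and failure to follow a requested value), the following minimal Python sketch may help. It is not the benchmark's actual data format or scoring code: the `Scenario` schema, the function names `value_selection_rates` and `steering_failure_rate`, and the toy scenarios are all hypothetical, introduced only for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Scenario:
    """One value-conflict scenario (hypothetical schema, not the benchmark's format)."""
    image_path: str
    actions: dict[str, str]  # candidate action text -> human value it aligns with


def value_selection_rates(scenarios, choices):
    """Fraction of scenarios in which the chosen action aligns with each value."""
    counts = Counter(s.actions[c] for s, c in zip(scenarios, choices))
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def steering_failure_rate(scenarios, target_values, choices):
    """Fraction of prompted cases where the chosen action does not match the requested value."""
    failures = sum(
        s.actions[c] != target
        for s, target, c in zip(scenarios, target_values, choices)
    )
    return failures / len(scenarios)


# Toy data for illustration only; not drawn from the benchmark.
scenarios = [
    Scenario("kitchen.png", {"wipe the counter now": "efficiency",
                             "ask before moving the mail": "privacy"}),
    Scenario("bedroom.png", {"enter and vacuum": "efficiency",
                             "wait until the room is empty": "privacy"}),
]
default_choices = ["wipe the counter now", "enter and vacuum"]
steered_choices = ["ask before moving the mail", "enter and vacuum"]

print(value_selection_rates(scenarios, default_choices))
print(steering_failure_rate(scenarios, ["privacy", "privacy"], steered_choices))
```

In this toy setup the planner picks the efficiency-aligned action by default in both scenarios, and when asked to prioritize privacy it complies in only one of the two, which is the kind of gap between default preference and requested value that the abstract reports.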
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



