RuleEdit: Failure-Guided Human-AI Model Editing with Prospective Impact Preview
Title: RuleEdit: A Human-AI Model Editing Framework Guided by Failure Detection and Prospective Impact Analysis
Abstract: Although artificial intelligence holds significant potential for aiding complex decision-making, practitioners lack tools that identify probable failures and let them inspect model edits before deployment. To address this, we introduce RuleEdit, an interactive system for rule-guided human-AI model editing. The system highlights potential failures using interpretable mismatch signals derived from rule tables, and it lets users provide rule-based feedback accompanied by prospective previews of expected performance shifts and embedding changes. We implemented RuleEdit for stroke rehabilitation assessment and evaluated it with health professionals and students. Rule-guided failure detection improved Human + AI performance by 14.16% (p<0.001). It also improved the rejection of incorrect AI suggestions and reduced over-reliance, under-reliance, and decisions that switched to incorrect outcomes. Prospective embedding previews significantly improved the quality of user feedback for model adaptation: post-update local performance gains rose from 11.50% to 36.38% after integrating user-provided rule-based feedback (p<0.001). Our study demonstrates that mismatch-based failure indicators and prospective impact previews can effectively support failure-aware model editing in human-AI systems. However, the findings also reveal a local-global trade-off: edits that help specific instances may harm global performance when applied broadly. We conclude by discussing broader implications for designing controllable and failure-aware human-AI systems.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




