Title: Med-Banana: Learning Quality-Controlled Medical Image Editing from Success-and-Failure Trajectories
Abstract:
Text-guided editing of medical images must apply the requested pathological change while preserving anatomical integrity, modality-specific visual characteristics, and clinical realism. Most existing datasets supervise editing models only with final, accepted edits and discard the unsuccessful attempts produced along the way. We argue that these failures are crucial for quality control: they provide supervision on what must be rejected, why an edit is medically or visually invalid, and how the instruction should be revised. To this end, we introduce Med-Banana, a framework for quality-controlled medical image editing built on trajectory supervision. We also release Med-Banana-80K, a dataset of success-and-failure editing trajectories containing candidate images, verification results, rejection reasons, and refined prompts. Using this resource, Med-Banana jointly trains an editor, a verifier, and a refiner, enabling an edit-verify-refine inference process that incorporates both accepted and rejected attempts. Experiments with MLLM judges, blind expert evaluations, and tests of source preservation and real-synthetic separability show consistent gains over existing open-source medical image editors. The code and data are publicly available.
Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
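
The edit-verify-refine process described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Attempt`, `Trajectory`, and `edit_verify_refine`, the callable signatures, and the `max_rounds` limit are all assumptions made for illustration, and images are represented by plain strings. The sketch only shows how a trajectory might record each candidate, its verification result, the rejection reason, and the refined prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One step of a trajectory (hypothetical record layout)."""
    prompt: str                      # prompt used for this attempt
    candidate: str                   # stand-in for the candidate image
    accepted: bool                   # verification result
    rejection_reason: Optional[str] = None


@dataclass
class Trajectory:
    """All attempts made for one source image and instruction."""
    source: str
    instruction: str
    attempts: List[Attempt] = field(default_factory=list)

    @property
    def succeeded(self) -> bool:
        return bool(self.attempts) and self.attempts[-1].accepted


def edit_verify_refine(
    source: str,
    instruction: str,
    editor: Callable[[str, str], str],
    verifier: Callable[[str, str, str], Tuple[bool, Optional[str]]],
    refiner: Callable[[str, str], str],
    max_rounds: int = 3,             # assumed budget, not from the paper
) -> Trajectory:
    """Run edit -> verify -> refine, keeping rejected attempts too."""
    trajectory = Trajectory(source, instruction)
    prompt = instruction
    for _ in range(max_rounds):
        candidate = editor(source, prompt)
        accepted, reason = verifier(source, candidate, instruction)
        trajectory.attempts.append(
            Attempt(prompt, candidate, accepted, None if accepted else reason)
        )
        if accepted:
            break
        # Use the rejection reason to rewrite the prompt for the next round.
        prompt = refiner(prompt, reason)
    return trajectory


# Toy stand-ins for the three components, for demonstration only.
def toy_editor(source: str, prompt: str) -> str:
    return f"{source} | {prompt}"


def toy_verifier(source: str, candidate: str, instruction: str):
    if "preserve anatomy" in candidate:
        return True, None
    return False, "anatomy not preserved"


def toy_refiner(prompt: str, reason: str) -> str:
    return f"{prompt}; preserve anatomy"


demo = edit_verify_refine(
    "chest_xray", "add a small nodule", toy_editor, toy_verifier, toy_refiner
)
for step in demo.attempts:
    print(step.accepted, step.prompt, step.rejection_reason)
```

In this toy run the first attempt is rejected and the refined prompt is accepted on the second round, so the returned trajectory holds one failure and one success, which is the kind of record the abstract describes as trajectory supervision.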






