UniCAD: A Unified Benchmark and Universal Model for Multi-Modal Multi-Task CAD
Title: UniCAD: Establishing a Unified Benchmark and Universal Model for Multi-Modal, Multi-Task CAD
Abstract:
Computer-Aided Design (CAD) serves as the foundation for contemporary engineering and manufacturing, facilitating the development of highly precise and editable 3D models. Despite this critical role, research in the field has traditionally focused on individual tasks in isolation. The advancement of multi-modal, multi-task learning for CAD has been significantly impeded by the lack of a standardized, unified benchmark. To bridge this divide, we present UniCAD, an extensive benchmark designed for multi-modal CAD learning. The benchmark spans multiple tasks and input modalities, including point-to-CAD reconstruction, text-to-CAD and image-to-CAD generation, and CAD-related question answering.
Complementing this benchmark, we introduce UniCAD-MLLM, a versatile multi-modal large language model. This model processes diverse inputs (text, images, sketches, and point clouds) and executes these heterogeneous tasks end-to-end within a single, cohesive framework. Experiments on both the UniCAD and Fusion360 benchmarks show that UniCAD-MLLM delivers state-of-the-art results across all tested tasks, consistently surpassing existing baselines, whether task-specific or multi-task. To foster further advancements in the field, we will make the dataset, source code, and pre-trained models publicly available.
Source: arXiv




