Symmetry-Aware 9D Pose Estimation with Sim(3)-Consistent Feature and Spherical Inception Convolution
Title: Symmetry-Aware 9D Pose Estimation via Sim(3)-Consistent Features and Spherical Inception Convolution
Abstract:
Accurate object pose estimation is essential for robotic agents that perceive and manipulate objects from visual data. Instance-level approaches often fail to generalize to novel objects; category-level methods address this but are frequently hindered by the non-linear structure of the Sim(3) space and by significant intra-class variation. To overcome these limitations, we introduce a category-level object pose estimation framework built on two main contributions. First, we present a translation and size estimator that incorporates a semantic-guided, symmetry-aware module. By drawing on the strong generalization ability of large vision models (LVMs) to identify symmetry points, this component estimates translation and size accurately without relying on shape priors. Its output serves as a precomputed reference for rotation, which simplifies learning in the Sim(3) space and provides a solid basis for the more difficult task of rotation estimation. Second, we propose a feature fusion mechanism based on our spherical large-kernel inception convolution. This module integrates semantic features from LVMs with systematically derived geometric features, extracting pose-critical attributes despite intra-class variation. It captures long-range dependencies efficiently, without excessive computational overhead. With these contributions, our approach achieves state-of-the-art performance on standard benchmarks and in real-world scenarios, and it enables a robust robotic picking system that handles a wide variety of objects. The source code will be released at the project page: https://panfei-cheng.github.io/SSH-Pose.
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





