PixVOD: Pixel-Distributed Direct Visual Odometry and Depth Estimation
Title: PixVOD: Distributed Pixel-Level Visual Odometry and Depth Inference
Abstract:
Although computer vision systems typically take 2D pixel arrays as their standard input, many of the underlying computations can be distributed across individual pixels. Transmitting raw, redundant, and noisy pixel data off the sensor is inefficient, which motivates focal-plane sensor-processors that execute a substantial portion of the computation directly at the pixel level. We envision pixels computing higher-level signals locally, reducing the load on downstream processing stages and supplying more informative inputs to higher-level vision applications.
We introduce a fully parallelizable approach to visual odometry and depth estimation that operates across pixels. This method utilizes Gaussian Belief Propagation (GBP) to facilitate information exchange among sensor-processors, allowing them to reach a consensus regarding camera movement and to deduce depth based on per-pixel photometric data and a surface normal prior. To ensure geometric stability during the optimization phase, we propose a keyframe-like anchoring strategy. This mechanism controls the effective baseline between frames, ensuring consistent updates for both motion and depth.
We assess our approach using realistic datasets, confirming the viability of on-sensor, GBP-driven distributed odometry and depth estimation enhanced by keyframe anchoring.
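To make the consensus idea concrete, the following is a minimal toy sketch of scalar Gaussian Belief Propagation, not the PixVOD implementation. It assumes a 1-D chain of nodes (standing in for pixels), each holding one noisy scalar measurement, with neighbours tied by a smoothness factor. The function name `gbp_chain_consensus` and the parameters `lam_meas` and `lam_pair` are hypothetical and chosen only for illustration. The actual method estimates camera motion and per-pixel depth from photometric factors, which this sketch does not attempt.

```python
def gbp_chain_consensus(z, lam_meas=1.0, lam_pair=10.0, iters=None):
    """Synchronous scalar GBP on a chain; returns posterior means and precisions.

    Assumed toy model (not from the paper):
      unary factor    x_i ~ N(z[i], 1 / lam_meas)
      pairwise factor (x_i - x_{i+1}) ~ N(0, 1 / lam_pair)
    """
    n = len(z)
    if iters is None:
        iters = n  # a chain is a tree, so messages settle within n - 1 sweeps

    # msgs[(i, j)] = (eta, lam): information-form message from node i to node j
    msgs = {}
    for i in range(n - 1):
        msgs[(i, i + 1)] = (0.0, 0.0)
        msgs[(i + 1, i)] = (0.0, 0.0)

    def neighbours(i):
        return [j for j in (i - 1, i + 1) if 0 <= j < n]

    def belief(i, exclude=None):
        # Product of the local measurement and all incoming messages,
        # optionally leaving out the message that came from `exclude`.
        eta, lam = lam_meas * z[i], lam_meas
        for k in neighbours(i):
            if k != exclude:
                e, l = msgs[(k, i)]
                eta += e
                lam += l
        return eta, lam

    for _ in range(iters):
        new_msgs = {}
        for (i, j) in msgs:
            eta_c, lam_c = belief(i, exclude=j)
            # Marginalise x_i out of the pairwise factor combined with the
            # cavity belief at i (Schur complement in information form).
            scale = lam_pair / (lam_pair + lam_c)
            new_msgs[(i, j)] = (scale * eta_c, scale * lam_c)
        msgs = new_msgs  # synchronous update: all nodes send at once

    beliefs = [belief(i) for i in range(n)]
    means = [eta / lam for eta, lam in beliefs]
    precisions = [lam for _, lam in beliefs]
    return means, precisions


if __name__ == "__main__":
    means, precisions = gbp_chain_consensus([1.0, 2.0, 4.0, 3.0, 5.0])
    print(means)
```

Each node only reads messages from its immediate neighbours, which is the property that makes this style of inference a candidate for per-pixel hardware. In this toy model, increasing `lam_pair` pulls the node estimates closer together, so the nodes approach agreement on a single shared value.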
Project Page: https://www.shinjeongkim.com/pixvod/
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC





