Policy-based Foveated Imaging and Perception
Title: Policy-Driven Foveated Imaging and Perception
Abstract
Ultra-high-resolution image sensors can capture the fine spatial detail that many visual perception applications need, but processing every pixel at full resolution is often impractical under power, latency, and bandwidth limits. Existing solutions typically downsample spatially or temporally during acquisition, which permanently discards data before the system can determine whether it is relevant to the task. To address this, we present a real-time, predictive, task-aware foveated imaging framework that operates directly at the point of capture. Using emerging dual-stream sensor technologies, our technique dynamically allocates limited pixel bandwidth to task-relevant regions of interest while preserving a low-resolution overview of the global scene. We formulate foveated acquisition as a sensor-attention policy-learning problem in which past observations inform decisions that shape future measurements, closing the loop between perception and acquisition. Simulations across several perception tasks show that the method maintains high performance under tight pixel budgets and significantly outperforms baselines given the same bandwidth. Finally, we validate the system on a 200-megapixel dual-stream sensor, recording real-world video under realistic operating constraints, which demonstrates the practical viability of task-driven, acquisition-time foveated imaging.
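The closed loop described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' method: it assumes a hand-written heuristic (steer the full-resolution window toward the overview cell that changed most) in place of the learned sensor-attention policy, and every name in it (`overview`, `choose_roi`, `acquire`, `run`, `blob_frame`) is hypothetical. It only shows how a low-resolution overview stream and a single full-resolution region of interest can share a fixed pixel budget, with each frame's observations deciding where the next frame is sampled.

```python
def overview(frame, factor):
    """Block-average a 2D frame (list of rows) to form the low-resolution stream."""
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(frame[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def choose_roi(prev_ov, curr_ov, factor, roi, shape, current):
    """Assumed stand-in policy: centre the ROI on the overview cell that changed most."""
    h, w = shape
    best, cell = 0.0, None
    for y, row in enumerate(curr_ov):
        for x, value in enumerate(row):
            change = abs(value - prev_ov[y][x])
            if change > best:
                best, cell = change, (y, x)
    if cell is None:  # nothing changed, so keep looking where we were
        return current
    # Map the overview cell centre to full-resolution coordinates and clamp to the frame.
    cy, cx = cell[0] * factor + factor // 2, cell[1] * factor + factor // 2
    top = min(max(cy - roi // 2, 0), h - roi)
    left = min(max(cx - roi // 2, 0), w - roi)
    return top, left


def acquire(frame, top, left, roi):
    """Read the full-resolution ROI stream for one frame."""
    return [row[left:left + roi] for row in frame[top:top + roi]]


def run(frames, factor, roi, budget):
    """Closed loop: past overviews choose the ROI used for the next measurement."""
    h, w = len(frames[0]), len(frames[0][0])
    cost = (h // factor) * (w // factor) + roi * roi  # pixels read per frame
    if cost > budget:
        raise ValueError("overview plus ROI exceeds the pixel budget")
    prev_ov = overview(frames[0], factor)
    position = ((h - roi) // 2, (w - roi) // 2)  # start at the frame centre
    log = []
    for frame in frames[1:]:
        patch = acquire(frame, position[0], position[1], roi)
        curr_ov = overview(frame, factor)
        log.append((position, patch))
        position = choose_roi(prev_ov, curr_ov, factor, roi, (h, w), position)
        prev_ov = curr_ov
    return log, cost


def blob_frame(size, top=None, left=None):
    """Toy scene: a zero background with an optional bright 2x2 blob."""
    frame = [[0.0] * size for _ in range(size)]
    if top is not None:
        for y in range(top, top + 2):
            for x in range(left, left + 2):
                frame[y][x] = 1.0
    return frame


# An empty frame, then a blob that appears in the lower-right corner and stays there.
frames = [blob_frame(16), blob_frame(16, 12, 12), blob_frame(16, 12, 12)]
log, cost = run(frames, factor=4, roi=4, budget=32)
roi_positions = [position for position, _ in log]
blob_pixels_seen = sum(sum(row) for row in log[1][1])
```

In this toy setting each frame costs 32 pixel reads (a 4x4 overview plus a 4x4 window) instead of 256 for the full 16x16 frame. The first measurement misses the blob because the window is still at the centre, but the overview registers the change and the next measurement captures the blob at full resolution.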
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





