arXiv

SEAOTTER: Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction

June 3, 2026 · Dan Jacobellis, Neeraja J. Yadwadkar

Title: SEAOTTER: Leveraging Sensor Embedded Autoencoding and One-Time Transcoding for Streamlined Reconstruction

Abstract:

Robotic platforms equipped with inexpensive, energy-efficient hardware can readily acquire high-resolution visual data. However, the utility of this data is often limited by constrained on-device processing power and narrow bandwidth, which hinder the effectiveness of traditional compression standards such as JPEG and MPEG. Although next-generation codecs like AV1 and AVIF offer superior rate-distortion performance, their computational cost makes them infeasible to deploy without specialized application-specific integrated circuits (ASICs). While recent asymmetric autoencoder architectures achieve high-fidelity results under strict power and bandwidth limits, they impose heavy decoding costs and rely on proprietary formats, thereby disregarding the extensive infrastructure built around established standards like JPEG.

To overcome these challenges, we present SEAOTTER (Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction), a compression framework tailored for cloud robotics. Recognizing that the sensor, cloud, and consumer endpoints operate under vastly different resource constraints, SEAOTTER merges the data efficiency of learned latent representations with the widespread compatibility of standard JPEG files. We address the performance degradation typically associated with naive transcoding by introducing a learnable transform for JPEG color spaces and quantization. This approach significantly enhances the accuracy of global, dense, and vision-language-based perception tasks.
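To make concrete where a learnable color transform and quantization would sit in a JPEG-style pipeline, the following is a minimal NumPy sketch. It is not the SEAOTTER implementation: it uses the standard JFIF color matrix and a caller-supplied quantization table, and only marks, as an assumption, the two places where trainable parameters could replace the fixed defaults. The names `JFIF_COLOR`, `encode_block`, and `decode_block` are hypothetical.

```python
import numpy as np

# Standard JFIF (BT.601) RGB -> YCbCr matrix. Assumption for illustration:
# a learned variant would treat these nine numbers as trainable parameters.
JFIF_COLOR = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])


def dct_matrix(n=8):
    """Orthonormal DCT-II basis, the transform JPEG applies to 8x8 blocks."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d


D = dct_matrix()


def encode_block(rgb, color, qtable):
    """Map one (8, 8, 3) RGB block to integer coefficients of shape (3, 8, 8).

    `color` is a 3x3 color matrix and `qtable` an 8x8 (or 3x8x8) table of
    quantization step sizes; both are the knobs a learned transcoder
    could tune (an assumption, not the authors' stated design).
    """
    planes = np.moveaxis(rgb.astype(np.float64) @ color.T, -1, 0)
    planes[0] -= 128.0                      # center the luma-like channel
    coeffs = D @ planes @ D.T               # 2-D DCT of each channel
    return np.rint(coeffs / qtable).astype(np.int32)


def decode_block(q, color, qtable):
    """Invert encode_block up to the rounding error of quantization."""
    planes = D.T @ (q * qtable) @ D         # dequantize, inverse 2-D DCT
    planes[0] += 128.0
    return np.moveaxis(planes, 0, -1) @ np.linalg.inv(color).T
```

With a table of ones the round trip is nearly lossless; larger step sizes zero out more coefficients and shrink the entropy-coded file, which is the trade-off any choice of table, learned or hand-designed, has to balance against downstream task accuracy.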

By training both general-purpose and task-specific transcoding pipelines on top of a pre-trained, frozen encoder, SEAOTTER achieves substantial efficiency gains. Compared to AVIF at a 200:1 compression ratio, our method encodes 7 times faster and decodes 3.5 times faster, while also improving ImageNet top-1 accuracy by 8%. Crucially, these gains are achieved without sacrificing compatibility with existing JPEG infrastructure. The source code is publicly available at https://github.com/UT-SysML/seaotter.
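For a sense of scale, a 200:1 ratio can be converted to a bit rate with simple arithmetic. The sketch below assumes the ratio is measured against uncompressed 8-bit RGB (24 bits per pixel), which the abstract does not state; the helper names and the 1920x1080 frame size are hypothetical.

```python
def bits_per_pixel(ratio, channels=3, bits_per_channel=8):
    """Bit rate implied by a compression ratio relative to raw pixels."""
    return channels * bits_per_channel / ratio


def compressed_bytes(width, height, ratio, channels=3, bits_per_channel=8):
    """Approximate compressed size in bytes for one frame."""
    raw_bits = width * height * channels * bits_per_channel
    return raw_bits / ratio / 8


print(bits_per_pixel(200))                # 0.12 bits per pixel
print(compressed_bytes(1920, 1080, 200))  # 31104.0 bytes, about 31 kB
```

Under that assumption, a raw 1920x1080 frame of roughly 6.2 MB would be transmitted in about 31 kB.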
