arXiv

MAEPose: Self-Supervised Spatiotemporal Learning for Human Pose Estimation on mmWave Video

Title: MAEPose: Enabling Self-Supervised Spatiotemporal Learning for Human Pose Estimation via mmWave Video

Abstract

Millimeter-wave (mmWave) radar offers a privacy-conscious alternative to RGB-based human pose estimation. However, current approaches depend largely on pre-extracted intermediate representations, such as spectrogram images or sparse point clouds. This preprocessing discards the rich spatiotemporal information in raw radar video streams and adds system complexity. Existing frameworks are also mostly limited to end-to-end supervised learning, so they cannot exploit unlabelled raw video streams to learn generalized representations.

To address these limitations, we introduce MAEPose, a human pose estimation framework based on masked autoencoding that operates directly on mmWave spectrogram videos. MAEPose learns motion-aware, generalized spatiotemporal representations from unlabelled radar video and then uses a heatmap decoder to produce multi-frame pose predictions. Evaluated on three datasets with leave-one-person-out cross-validation and rigorous statistical testing, MAEPose consistently outperforms state-of-the-art baselines, improving Mean Per Joint Position Error (MPJPE) by up to 22.1% with statistical significance (p<0.05). The model is also robust: under zero-shot bystander interference, its error increases by only 6.5%. Ablation studies show that both the pre-training phase and the heatmap decoder contribute substantially to performance. Finally, a modality analysis shows that Range-Doppler video as input yields better pose estimation than Range-Azimuth data or the fusion of the two, at lower computational cost.
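
For readers unfamiliar with the metric, the sketch below shows one common way to compute MPJPE (the mean Euclidean distance between predicted and ground-truth joints) and a signed relative change between two error values. It is an illustration only, not the authors' evaluation code: the function names `mpjpe` and `relative_change`, the nested-list input layout, and the toy coordinates are all assumptions made for this example, and the abstract does not state the coordinate dimensionality or units.

```python
import math


def mpjpe(pred, target):
    """Mean Per Joint Position Error over all frames and joints.

    pred, target: sequences of frames; each frame is a sequence of joints;
    each joint is a coordinate tuple (assumed layout, same dimensionality).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    total, count = 0.0, 0
    for frame_p, frame_t in zip(pred, target):
        if len(frame_p) != len(frame_t):
            raise ValueError("frames must have the same number of joints")
        for joint_p, joint_t in zip(frame_p, frame_t):
            # Euclidean distance between one predicted and one true joint.
            total += math.dist(joint_p, joint_t)
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


def relative_change(baseline, value):
    """Signed change of `value` relative to `baseline`, in percent.

    Negative means `value` is lower (better, for an error metric).
    """
    return 100.0 * (value - baseline) / baseline


# Toy data (not from the paper): one frame, two joints.
toy_pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
toy_true = [[(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]]
toy_error = mpjpe(toy_pred, toy_true)  # distances 5.0 and 0.0, mean 2.5
```

Under this reading, an "improvement of up to 22.1%" corresponds to `relative_change(baseline_error, model_error)` being about -22.1, and the bystander result corresponds to a value of about +6.5 relative to the interference-free error.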


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
