arXiv

Paving the Way for Point Cloud Video Representation Learning Using A PDE Model

June 2, 2026 · Zhuoxu Huang, Zhenkun Fan, Jungong Han, Josef Kittler

Title: Advancing Point Cloud Video Representation Learning via a PDE-Based Framework

Abstract: Understanding the spatial-temporal dynamics of point cloud videos, specifically how spatial points evolve over time, is essential for accurate analysis. However, traditional techniques, particularly flow-based ones, often fail to capture these correlations effectively because sequential point cloud data is inherently unordered. To overcome this limitation, we introduce a novel methodology that frames spatial-temporal correlation learning as the problem of solving a Partial Differential Equation (PDE). Although PDEs have proven robust in physical modeling, their potential for handling newer forms of sequential data such as point cloud videos has yet to be fully realized. Drawing inspiration from fluid dynamics, we develop a streamlined PDE model. The solution process is then optimized through a contrastive learning framework that aligns temporal embeddings with spatial embeddings. This additional supervision allows our proposed method, MotionPDE, to function as an efficient, plug-and-play enhancement for existing backbone architectures, adding negligible computational cost and parameter overhead. Leveraging the contrastive learning objective, we further explore MotionPDE in a self-supervised setting, with promising results that highlight its effectiveness and versatility in interpreting point cloud video data. To support ongoing research, the source code and trained checkpoints will be publicly released at https://github.com/zhh6425/motionpde.git.
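
The abstract does not specify the contrastive objective used to align temporal and spatial embeddings. As a rough illustration only, the sketch below uses a generic InfoNCE-style loss, which is an assumption and not necessarily the loss used in MotionPDE. The function names (`l2_normalize`, `info_nce`), the `temperature` value, and the toy two-dimensional embeddings are all hypothetical. Row i of `temporal` and row i of `spatial` are treated as a positive pair, and every other row acts as a negative.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(temporal, spatial, temperature=0.1):
    """Generic InfoNCE loss; temporal[i] and spatial[i] form the positive pair.

    This is an illustrative stand-in, not the objective from the paper.
    """
    t = [l2_normalize(v) for v in temporal]
    s = [l2_normalize(v) for v in spatial]
    loss = 0.0
    for i, ti in enumerate(t):
        # Cosine similarity of temporal embedding i to every spatial embedding.
        logits = [sum(a * b for a, b in zip(ti, sj)) / temperature for sj in s]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability assigned to the matching spatial embedding.
        loss += log_denom - logits[i]
    return loss / len(t)


# Toy embeddings (hypothetical): two samples in a 2-D embedding space.
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(aligned, aligned)    # positives match
loss_shuffled = info_nce(aligned, shuffled)  # positives mismatched
```

In this toy setup the loss is close to zero when each temporal embedding matches its paired spatial embedding and much larger when the pairs are mismatched. Minimizing such a loss pulls paired embeddings together, which is the kind of alignment the abstract describes.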
