arXiv

BEAST3D: Animal behavioral analysis and neural encoding from multi-view video via Gaussian splatting

Abstract:

While multi-view video has become a standard method for capturing the three-dimensional movements of animals in experimental contexts, deriving comprehensive 3D data from these recordings continues to pose significant technical hurdles. Traditional supervised pose estimation is hindered by the need for labor-intensive manual labeling, whereas off-the-shelf 3D reconstruction tools, typically trained on broad scene datasets, struggle with the unique imagery and limited viewpoints characteristic of laboratory environments. To overcome these obstacles, we introduce BEAST3D, a self-supervised pretraining framework designed to learn 3D visual representations directly from unlabeled, calibrated multi-view video.

BEAST3D employs a vision transformer to predict 3D Gaussian splats, which allow unseen camera angles to be reconstructed via differentiable rendering. Concurrently, the model isolates the animal from its surroundings through segmentation. By conditioning directly on known camera parameters, BEAST3D can reconstruct 3D structure from as few as four views. This contrasts sharply with general-purpose models, which rely on dense, overlapping viewpoints to estimate camera geometry, a condition rarely met in lab settings.
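To make the role of known camera parameters concrete, the following is a minimal sketch of how a 3D point (for example, the center of one predicted Gaussian) could be projected into any calibrated view with a standard pinhole model. This is not BEAST3D's implementation: the function name `project_point`, its argument layout, and the example intrinsics are assumptions made purely for illustration, and the sketch covers only the projection of a point, not the full differentiable splat renderer.

```python
def project_point(point_world, R, t, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    Hypothetical helper for illustration only (not part of BEAST3D).
    R is a 3x3 rotation (nested lists) and t a length-3 translation such
    that x_cam = R @ x_world + t. fx, fy, cx, cy are the intrinsics.
    Returns (u, v, depth), or None if the point lies behind the camera.
    """
    # World -> camera coordinates using the calibrated extrinsics.
    x_cam = [
        sum(R[i][j] * point_world[j] for j in range(3)) + t[i]
        for i in range(3)
    ]
    depth = x_cam[2]
    if depth <= 0:
        return None  # not visible from this view

    # Perspective division followed by the intrinsic mapping to pixels.
    u = fx * x_cam[0] / depth + cx
    v = fy * x_cam[1] / depth + cy
    return (u, v, depth)


# Example with an assumed camera at the world origin looking down +z.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point(
    (0.1, -0.2, 2.0), IDENTITY, (0.0, 0.0, 0.0),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
```

Because the extrinsics and intrinsics are given by calibration rather than estimated from image overlap, the same 3D point can be mapped into every camera, including a held-out one, which is what allows a rendered novel view to be compared against the real recording.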

Our extensive evaluation across four distinct species confirms that BEAST3D generates rich, viewpoint-invariant features that transfer robustly to three key downstream applications: novel view synthesis, which serves as a benchmark for the fidelity of the learned 3D representations; multi-view pose estimation, which yields the sparse keypoint trajectories essential for behavioral analysis; and neural encoding, which correlates 3D behavioral metrics with concurrent neural activity data. Consequently, BEAST3D offers a flexible framework for behavioral analysis that capitalizes on the 3D structural information inherent in contemporary multi-view laboratory recordings.


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
