arXiv

On Out-of-sample Embedding in UMAP

June 4, 2026 · Mohammad Tariqul Islam, Jason W. Fleischer

Title: Addressing Out-of-Sample Embedding Challenges in UMAP

Abstract: Neighbor embedding techniques elucidate relationships within high-dimensional datasets by generating corresponding graph structures in reduced-dimensional spaces. Among these methods, Uniform Manifold Approximation and Projection (UMAP) has gained significant traction, leveraging algebraic topology to align distance metrics across the original and projected spaces. Despite its effectiveness across various data types, UMAP struggles with integrating out-of-sample points into an already established mapping. Specifically, the algorithm tends to position new data points at the edges of existing clusters rather than embedding them within the cluster interiors alongside their correlated neighbors. This study addresses this "repulsion effect" by optimizing pairwise interactions within the initial k-nearest-neighbor graph. Furthermore, we demonstrate that parameterized UMAP yields superior embeddings compared to non-parametric alternatives, a distinction that becomes more pronounced as data complexity increases, such as in the case of medical imagery. Additionally, our findings indicate that employing a parameterized UMAP naturally alleviates the repulsion issue. We evaluate various UMAP methodologies by assessing trustworthiness, utilizing nearest neighbor classifiers, and examining the attractive and repulsive forces inherent in the resulting embeddings.
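The abstract names trustworthiness as one of its evaluation measures. As a rough illustration only (this is not the authors' code), the sketch below computes the standard trustworthiness score in plain Python: for each point, it penalizes neighbors that appear among the k nearest in the embedding but not among the k nearest in the original space, weighted by how far down the original ranking they sit. The function names `rank_neighbors` and `trustworthiness` and the toy data are hypothetical; for real workloads, scikit-learn provides `sklearn.manifold.trustworthiness`.

```python
import math


def rank_neighbors(points, i):
    """Indices of all other points, ordered by Euclidean distance from point i.

    Ties are broken by index so the ordering is deterministic.
    """
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: (math.dist(points[i], points[j]), j))


def trustworthiness(high, low, k):
    """Trustworthiness of embedding `low` with respect to original data `high`.

    `high` and `low` are equal-length sequences of coordinate tuples.
    Returns a value in [0, 1]; 1.0 means no point gained a false neighbor.
    """
    n = len(high)
    if len(low) != n:
        raise ValueError("high and low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2 for the normalization to hold")

    penalty = 0
    for i in range(n):
        high_order = rank_neighbors(high, i)
        # Rank 1 is the nearest neighbor in the original space.
        rank_in_high = {j: r for r, j in enumerate(high_order, start=1)}
        high_knn = set(high_order[:k])
        for j in rank_neighbors(low, i)[:k]:
            if j not in high_knn:
                # j is a "false" neighbor introduced by the embedding.
                penalty += rank_in_high[j] - k

    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


if __name__ == "__main__":
    # Toy data (assumed for illustration): points on a line in 3-D.
    high = [(float(i), 2.0 * i, 0.0) for i in range(8)]
    faithful = [(float(i),) for i in range(8)]                  # keeps the order
    scrambled = [(float(p),) for p in [0, 7, 1, 6, 2, 5, 3, 4]]  # shuffles it
    print(trustworthiness(high, faithful, k=2))
    print(trustworthiness(high, scrambled, k=2))
```

An embedding that preserves every point's neighbor ordering scores exactly 1.0, while one that pulls unrelated points together scores lower. The same measure could be applied to out-of-sample points only, by restricting the outer loop to the newly embedded indices.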

