arXiv

GENEB: Why Genomic Models Are Hard to Compare

Title: GENEB: The Challenges of Comparing Genomic Models

Abstract

Assessing progress in genomic foundation models remains difficult because benchmarks are fragmented, evaluation protocols are inconsistent, and results are reported task by task. Consequently, claims that a model is superior or generally applicable are often hard to compare directly. To address this, we present GENEB, a comprehensive diagnostic benchmark that evaluates frozen representations from 40 genomic foundation models. It assesses performance on 100 distinct tasks organized into 13 functional categories under a unified probing-based protocol that includes few-shot settings.
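To make the idea of probing frozen representations in a few-shot setting concrete, here is a minimal sketch. It is not GENEB's actual protocol: the nearest-centroid probe, the function names, and the toy embeddings are all assumptions chosen for illustration. The point it shows is that only the small probe is fit on k labelled examples per class, while the encoder that produced the embeddings is never updated.

```python
import random
from statistics import mean


def few_shot_split(labels, k, seed=0):
    """Pick k training indices per class; all remaining indices form the test set."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train = []
    for _, idxs in sorted(by_class.items()):
        train.extend(rng.sample(idxs, k))
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return train, test


def fit_centroids(embeddings, labels, train_idx):
    """Fit the probe: one mean vector per class. The embeddings stay fixed."""
    groups = {}
    for i in train_idx:
        groups.setdefault(labels[i], []).append(embeddings[i])
    return {y: [mean(col) for col in zip(*vecs)] for y, vecs in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda y: sqdist(centroids[y], x))


def probe_accuracy(embeddings, labels, k, seed=0):
    """Few-shot probe accuracy on the held-out examples."""
    train, test = few_shot_split(labels, k, seed)
    centroids = fit_centroids(embeddings, labels, train)
    hits = sum(predict(centroids, embeddings[i]) == labels[i] for i in test)
    return hits / len(test)


# Hypothetical frozen embeddings for a two-class task (illustrative values only).
embeddings = [
    [0.0, 0.1], [0.2, -0.1], [0.1, 0.0], [-0.1, 0.2],
    [5.0, 5.1], [4.9, 5.2], [5.2, 4.8], [5.1, 5.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

accuracy = probe_accuracy(embeddings, labels, k=2)
```

Because the same split and probe would be applied to every model's embeddings, differences in the resulting accuracy can be attributed to the representations rather than to task-specific fine-tuning choices.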

GENEB enables controlled comparisons that isolate model scale, architecture, tokenization, and pretraining data, and it exposes trade-offs at the task level. We find that aggregate leaderboards are unstable: model rankings change substantially from one task category to another. Increasing model scale yields only modest and inconsistent improvements, whereas the alignment between architecture and pretraining data often matters more than parameter count. These findings expose the shortcomings of existing evaluation practice and position GENEB as a standardized reference for principled comparison and category-aware model selection in genomic machine learning.
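The claim that an aggregate leaderboard can hide category-level rank changes can be illustrated with a short sketch. Everything in it is hypothetical: the model names, the category names, and the scores are invented for illustration and are not results from GENEB. It ranks models within each category, ranks them by mean score, and reports how far each model's rank moves across categories.

```python
from statistics import mean

# Hypothetical per-category scores (higher is better); not GENEB results.
scores = {
    "model_a": {"regulatory": 0.82, "splicing": 0.61, "variant_effect": 0.70},
    "model_b": {"regulatory": 0.74, "splicing": 0.79, "variant_effect": 0.66},
    "model_c": {"regulatory": 0.78, "splicing": 0.70, "variant_effect": 0.73},
}


def rank_by(score_of, models):
    """Map each model to its rank (1 = best) under the given scoring function."""
    ordered = sorted(models, key=score_of, reverse=True)
    return {m: r for r, m in enumerate(ordered, start=1)}


def category_ranks(scores):
    """Rank models separately within every category."""
    models = list(scores)
    categories = sorted({c for per_model in scores.values() for c in per_model})
    return {c: rank_by(lambda m: scores[m][c], models) for c in categories}


def aggregate_ranks(scores):
    """Rank models by their mean score over all categories."""
    return rank_by(lambda m: mean(scores[m].values()), list(scores))


def rank_spread(scores):
    """Worst rank minus best rank across categories, per model."""
    per_cat = category_ranks(scores)
    return {
        m: max(r[m] for r in per_cat.values()) - min(r[m] for r in per_cat.values())
        for m in scores
    }


per_category = category_ranks(scores)
overall = aggregate_ranks(scores)
spread = rank_spread(scores)
```

In this toy data a different model leads each of the three categories, so the overall ranking alone would be a poor guide for choosing a model for any single category. A category-aware selection would read `per_category` for the task at hand instead of `overall`.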


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
