arXiv

End-to-End Text Line Detection and Ordering

Title: Integrated Detection and Sequencing of Text Lines

Abstract:

Standard workflows for recognizing text in historical documents usually split layout analysis into two distinct phases: first detecting text lines, and then determining their reading order. The ordering phase typically relies on hand-coded geometric rules, which often fail on marginalia, multi-column layouts, tables, or editorial styles unique to certain sources. To address these limitations, this paper presents Orli (Ordered Regression of Lines), an end-to-end framework that unifies both tasks into a single image-to-sequence problem. Orli autoregressively generates text-line baselines from a page image, directly in the correct reading order. Each baseline uses a chord-frame parameterization that defines the line’s position, orientation, and length, and captures local geometry via perpendicular offsets. The final curve is produced by an iterative refinement head combined with a local visual refiner.
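To make the chord-frame idea concrete, here is a minimal sketch of how such parameters could be decoded into a baseline polyline. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say where the chord is anchored, how many offsets are used, or how they are spaced. The function name `decode_chord_frame`, the midpoint anchoring, and the even spacing of offsets along the chord are all hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def decode_chord_frame(
    cx: float,
    cy: float,
    theta: float,
    length: float,
    offsets: Sequence[float],
) -> List[Point]:
    """Decode hypothetical chord-frame parameters into baseline points.

    Assumptions (not taken from the paper): the chord is anchored at its
    midpoint (cx, cy), theta is its angle in radians, and the offsets are
    sampled at evenly spaced positions along the chord.
    """
    if len(offsets) < 2:
        raise ValueError("need at least two offsets to span the chord")

    # Unit vector along the chord and its perpendicular (normal) vector.
    ux, uy = math.cos(theta), math.sin(theta)
    nx, ny = -uy, ux

    last = len(offsets) - 1
    points: List[Point] = []
    for i, offset in enumerate(offsets):
        # Signed distance from the chord midpoint, from -length/2 to +length/2.
        t = (i / last - 0.5) * length
        points.append((cx + t * ux + offset * nx, cy + t * uy + offset * ny))
    return points


if __name__ == "__main__":
    # A 100-unit horizontal chord centred at (50, 20) with a bump in the middle.
    for x, y in decode_chord_frame(50.0, 20.0, 0.0, 100.0, [0.0, 5.0, 0.0]):
        print(f"{x:.1f}, {y:.1f}")
```

Separating the chord (a coarse position, angle, and length) from the offsets (small local deviations) means a straight line needs only zero offsets, while curved or skewed lines are expressed as corrections relative to the chord.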

Trained on a diverse dataset of 196,691 pages covering ten different writing systems, Orli slightly surpasses the current state of the art for cBAD line detection without requiring dataset-specific training. It achieves near-perfect coverage and ordering accuracy across various reading-order benchmarks in a zero-shot setting. Furthermore, the model can adapt to specialized out-of-domain layouts with minimal fine-tuning. The source code and model weights are publicly accessible under an open license at https://github.com/mittagessen/orli.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
