arXiv

Probing Minimalist Phase Structure in LLMs: What Universal Dependencies Cannot Represent

Abstract:

Structural probes rely on Universal Dependencies (UD), a framework that does not represent formal-syntactic abstractions such as phase boundaries or phase-internal cohesion. Whether large language models (LLMs) encode these abstractions therefore remains an open question, and one that UD-based probing cannot address by construction. To investigate it, we applied structural probes to wh-movement stimuli constructed so that UD distances are constant across conditions. Any non-zero effect on these stimuli must therefore stem from syntactic structure that lies beyond the scope of UD.
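The logic of the controlled contrast can be illustrated with a minimal sketch. It assumes a distance probe of the common linear form, in which the squared distance between two token representations is measured after a learned linear map `B`; the abstract does not state the exact probe formulation, and the function names and toy values below are hypothetical.

```python
from statistics import mean


def probe_distance(B, h_i, h_j):
    """Squared L2 distance between two hidden states after the linear map B.

    B is a list of rows (a matrix); h_i and h_j are hidden-state vectors.
    This is an assumed probe form, not necessarily the one used in the paper.
    """
    diff = [a - b for a, b in zip(h_i, h_j)]
    projected = [sum(w * x for w, x in zip(row, diff)) for row in B]
    return sum(p * p for p in projected)


def condition_effect(dists_a, dists_b):
    """Mean probe-distance difference between two conditions.

    When the UD distance of the word pair is identical in both conditions,
    a non-zero value here cannot be attributed to UD structure.
    """
    return mean(dists_b) - mean(dists_a)


# Toy illustration with made-up numbers (not data from the study).
B = [[1.0, 0.0], [0.0, 2.0]]
d = probe_distance(B, [1.0, 1.0], [0.0, 0.0])
effect = condition_effect([1.0, 3.0], [2.0, 4.0])
```

In practice `B` would be trained on model activations and the effect would be tested statistically per model; the sketch only shows what quantity is being compared across conditions.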

We examined three conditions (bare small clauses, infinitivals, and finite clauses), ranked by the number of Minimalist Program (MP) phase boundaries that the wh-element crosses. Across 13 LLMs from four families, we found a phase-count gradient in cross-clause pairs in 12 of 13 models and a sign asymmetry in within-clause pairs in all 13 models. The UD distance for the within-clause pairs is identical across all conditions, yet the asymmetry persists. This pattern is consistent with the MP abstraction of phase-internal cohesion, a structural property that UD does not represent.

Activation patching further confirmed that these representations are causally active in 12 of the 13 models. These results indicate that distributional pretraining can induce representations that correspond to formal-syntactic abstractions beyond the reach of annotation-based probing. UD-grounded probes therefore provide a lower bound on syntactic encoding, not an upper bound.
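For readers unfamiliar with activation patching, the following is a minimal sketch of the general technique on a toy model. It is not the authors' setup: the `forward` function, its arithmetic, and the normalized effect measure are hypothetical stand-ins chosen only to show the clean-run, corrupted-run, patched-run comparison.

```python
def forward(x, patch=None):
    """Toy model: hidden = 2*x + 1 per position, output = sum of 3*hidden.

    `patch` optionally maps a position index to a replacement hidden value,
    mimicking the overwrite of one intermediate activation.
    """
    hidden = [2.0 * v + 1.0 for v in x]
    for idx, value in (patch or {}).items():
        hidden[idx] = value
    output = sum(3.0 * h for h in hidden)
    return output, hidden


def patching_effect(clean_x, corrupt_x, position):
    """Fraction of the clean-vs-corrupt output gap restored by patching
    the clean hidden activation at `position` into the corrupted run."""
    clean_out, clean_hidden = forward(clean_x)
    corrupt_out, _ = forward(corrupt_x)
    patched_out, _ = forward(corrupt_x, patch={position: clean_hidden[position]})
    gap = clean_out - corrupt_out
    return (patched_out - corrupt_out) / gap if gap else 0.0


# Toy inputs (made up): each position restores part of the output gap.
effect_pos0 = patching_effect([1.0, 2.0], [0.0, 0.0], position=0)
effect_pos1 = patching_effect([1.0, 2.0], [0.0, 0.0], position=1)
```

A restored fraction well above zero at a given site is the kind of evidence usually read as that activation being causally involved in the behavior; with a real LLM the overwrite would be done on cached layer activations rather than on a toy list.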


Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
