arXiv

Reasoning over Grammar: Can Synthetic Linguistic Reasoning Traces Enhance Low-Resource Machine Translation?

Abstract:

Large language models (LLMs) offer a viable route to machine translation (MT) for extremely low-resource languages by incorporating linguistic resources through in-context learning. However, these models often struggle to make effective use of grammatical information during translation. Inspired by recent advances in chain-of-thought reasoning, this study investigates whether low-resource MT can be improved with structured intermediate steps of linguistic analysis and grammatical reasoning.

We introduce a pipeline that automatically generates step-by-step linguistic reasoning traces from Universal Dependencies treebanks, dictionaries, and grammar-rule banks. We evaluate these traces in three settings: in-context learning (ICL), supervised fine-tuning (SFT), and reinforcement fine-tuning (RFT), using Xibe and Chintang as case studies.

Our findings indicate that linguistic reasoning traces are most effective when used as guidance at inference time. Specifically, in the ICL setting, reliable, sentence-specific traces significantly improve translation performance across a wide range of models, languages, and evaluation metrics. In contrast, using the traces as training data yields more modest and inconsistent improvements: models learn the format of the traces but frequently produce inaccurate content. These results suggest that LLMs can make effective use of grammatical information for low-resource MT when given reliable linguistic analyses, but that generating such analyses autonomously remains a significant challenge.
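To make the idea of a sentence-specific reasoning trace concrete, here is a minimal Python sketch of how a parsed sentence, a dictionary, and a grammar-rule bank might be combined into numbered steps and placed in an ICL prompt. It is an illustration under stated assumptions, not the authors' pipeline: the `Token` fields are a subset of what a Universal Dependencies treebank provides, the names `build_trace` and `build_prompt` are hypothetical, and the two-word example sentence, its glosses, and its rules are invented toy data rather than Xibe or Chintang.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One word of a dependency-parsed sentence (subset of treebank fields)."""
    idx: int     # 1-based position in the sentence
    form: str    # surface form
    lemma: str   # dictionary form
    upos: str    # universal part-of-speech tag
    feats: dict  # morphological features, e.g. {"Case": "Acc"}
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # dependency relation to the head


def build_trace(tokens, dictionary, grammar_rules):
    """Return reasoning steps: word-level analysis first, then syntax."""
    steps = []
    by_idx = {t.idx: t for t in tokens}
    for t in tokens:
        gloss = dictionary.get(t.lemma, "<unknown>")
        steps.append(f"Word '{t.form}': lemma '{t.lemma}' ({t.upos}), gloss '{gloss}'.")
        # Attach a grammar-rule explanation for each feature that has one.
        for feat, value in sorted(t.feats.items()):
            rule = grammar_rules.get((feat, value))
            if rule:
                steps.append(f"{feat}={value} on '{t.form}': {rule}")
    for t in tokens:
        if t.head == 0:
            steps.append(f"'{t.form}' is the root of the sentence.")
        else:
            steps.append(f"'{t.form}' is the {t.deprel} of '{by_idx[t.head].form}'.")
    return steps


def build_prompt(source, steps):
    """Place the numbered trace in a prompt as inference-time guidance."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
    return (f"Source sentence: {source}\n"
            f"Linguistic analysis:\n{numbered}\n"
            f"English translation:")


# Invented toy data (hypothetical language, not from the paper).
tokens = [
    Token(1, "mira", "mir", "NOUN", {"Case": "Acc"}, 2, "obj"),
    Token(2, "tolu", "tol", "VERB", {"Tense": "Past"}, 0, "root"),
]
dictionary = {"mir": "bird", "tol": "see"}
grammar_rules = {
    ("Case", "Acc"): "marks the direct object.",
    ("Tense", "Past"): "translate with the English past tense.",
}

steps = build_trace(tokens, dictionary, grammar_rules)
prompt = build_prompt("mira tolu", steps)
print(prompt)
```

In this sketch the trace is deterministic and derived from gold annotations, which corresponds to the "reliable" condition the abstract describes. Replacing `tokens` with a model's own predicted analysis would correspond to the harder setting in which the model must generate the analysis itself.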


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
