arXiv

DAR: Deontic Reasoning with Agentic Harnesses

Original: arXiv:2606.05009v1 Announce Type: cross Abstract: Deontic reasoning is the task of answering questions by applying explicit rules and policies to case-specific facts, for example computing tax liability under a statute or determining the outcome of an immigration appeal. A key technical challenge for LLM-based deontic reasoning is that the relevant ruleset can be long and cross-referenced, so models may still fail to locate the rules needed for a particular reasoning step. We introduce Deontic Agentic Reasoning (DAR), an agentic reasoning setup in which the model interacts with the statutes on demand. We evaluate DAR under multiple harnesses on hard subsets of DeonticBench. Across these settings, we find that agentic harnesses can push the frontier on deontic reasoning tasks, but improvements are not uniform: weaker models often degrade on numerical tasks while consuming far more tokens.

Rewrite: Abstract: Deontic reasoning answers questions by applying explicit rules and policies to the facts of a specific case, such as computing tax liability under a statute or deciding an immigration appeal. A key technical challenge for large language models (LLMs) is that the relevant ruleset is often long and cross-referenced, so a model can fail to locate the rules a given reasoning step requires. To address this, we propose Deontic Agentic Reasoning (DAR), a setup in which the model consults the statutes interactively, on demand. We evaluate DAR under several harnesses on hard subsets of DeonticBench. Agentic harnesses can advance the frontier on deontic reasoning tasks, but the gains are uneven: weaker models often perform worse on numerical tasks while consuming far more tokens.
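To make "interacting with the statutes on demand" concrete, here is a minimal sketch of the idea, not the paper's harness. Everything in it is an assumption for illustration: the toy ruleset, the section ids, and the helper names `lookup`, `cross_references`, and `gather_rules` are hypothetical, and a deterministic traversal stands in for the model deciding which section to fetch next.

```python
import re

# Hypothetical toy ruleset: section ids and texts are invented for illustration.
STATUTES = {
    "s1": "Tax is 10% of taxable income as defined in s2.",
    "s2": "Taxable income is gross income minus the deduction in s3.",
    "s3": "The standard deduction is 1000.",
    "s4": "Unrelated filing deadline provision.",
}


def lookup(section_id):
    """Tool call: return the text of one section, or None if it does not exist."""
    return STATUTES.get(section_id)


def cross_references(text):
    """Find section ids mentioned in a section's text (toy pattern: 's' + digits)."""
    return re.findall(r"\bs\d+\b", text)


def gather_rules(start):
    """Follow cross-references from a starting section, fetching each section once.

    In an agentic setup the model would choose each lookup; this fixed
    breadth-first traversal is only a stand-in for that decision.
    """
    seen, queue = [], [start]
    while queue:
        section_id = queue.pop(0)
        if section_id in seen:
            continue
        text = lookup(section_id)
        if text is None:
            continue
        seen.append(section_id)
        queue.extend(cross_references(text))
    return seen


print(gather_rules("s1"))  # ['s1', 's2', 's3']
```

Only the sections reachable from the starting rule are fetched, so the unrelated `s4` never enters the context. Each additional lookup also costs tokens, which is the trade-off the abstract reports for weaker models.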


Source: arXiv
Generated at: 2026-06-04 00:00:00 UTC
