arXiv

R-APS: Compositional Reasoning and In-Context Meta-Learning for Constrained Design via Reflective Adversarial Pareto Search

Abstract:

While Large Language Models (LLMs) are fluent in open-ended tasks, this fluency does not guarantee reliable performance in agentic environments, where systems must plan, use tools, and execute actions over extended periods. We attribute this reliability gap to three interconnected structural deficiencies: the lack of error localization, the absence of evaluation under worst-case perturbations, and the failure to invalidate accumulated knowledge. We posit that all three stem from a common source: the conflicting demands that abductive, counterfactual, meta-inductive, corrective, and inductive reasoning modes place on a shared context.

To address these challenges simultaneously, we introduce Reflective Adversarial Pareto Search (R-APS). To the best of our knowledge, it is the first approach to tackle all three failures through reasoning-mode decomposition. R-APS assigns a dedicated context to each reasoning mode and orchestrates their interaction across three distinct timescales: (i) staged compositional reasoning paired with a typed validation critic, for failure localization; (ii) sensitivity-guided counterfactual stress-testing treated as a primary Pareto objective, for robustness; and (iii) meta-inductive rule extraction with explicit invalidation, for persistent memory management. R-APS requires no fine-tuning: it works with a frozen LLM and relies solely on structured protocol design.
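To make the idea of treating worst-case robustness as a Pareto objective concrete, here is a minimal sketch. It is not the R-APS implementation: the function names (`worst_case_error`, `dominates`, `pareto_front`), the additive perturbation model, and the three-objective score tuple are all assumptions chosen for illustration, with every objective minimized.

```python
from typing import Callable, List, Sequence, Tuple

Design = Sequence[float]


def worst_case_error(
    design: Design,
    error_fn: Callable[[Design], float],
    perturbations: Sequence[Sequence[float]],
) -> float:
    """Largest error over a finite set of additive perturbations (hypothetical model)."""
    return max(
        error_fn([d + p for d, p in zip(design, delta)]) for delta in perturbations
    )


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Indices of the non-dominated score tuples (all objectives minimized)."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Illustrative scores: (tracking error, bar count, worst-case error). Made-up values.
scores = [(1.0, 4, 2.0), (2.0, 4, 3.0), (0.5, 6, 2.5)]
front = pareto_front(scores)
```

In this toy example the second candidate is dominated by the first, so only the first and third remain on the front. Because the worst-case term is its own coordinate, a design cannot be admitted on nominal accuracy alone.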

We evaluated R-APS on planar mechanism synthesis tasks relevant to robotics, prosthetics, and mechanical design, where every candidate design was verified by a kinematic solver. Across 32 target trajectories, R-APS achieved robustness certificates 3.5 times tighter than those from uniform-perturbation baselines. Compared to an Enum+GA baseline, it reached first admission 46% faster (measured in iterations) and reduced Chamfer distance by a factor of 2.1, while simultaneously controlling for bar count and worst-case robustness. Furthermore, small 4B reasoning-specialized models were competitive with general-purpose 70B backbones within the protocol, indicating that structured protocols can reduce the need for large model scale.
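For readers unfamiliar with the Chamfer distance used above to compare a generated trajectory with its target, the following is a small sketch of one common variant (symmetric, mean nearest-neighbor Euclidean distance). The abstract does not state which variant the authors use, so the normalization here and the name `chamfer_distance` are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty planar point sets."""

    def one_sided(src: Sequence[Point], dst: Sequence[Point]) -> float:
        # Mean distance from each point in src to its nearest neighbor in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_sided(a, b) + one_sided(b, a)


# Two tiny sampled "trajectories" (made-up points for illustration).
target = [(0.0, 0.0), (1.0, 0.0)]
traced = [(0.0, 0.0), (1.0, 1.0)]
d = chamfer_distance(target, traced)
```

This brute-force version is quadratic in the number of points, which is fine for short sampled curves; a spatial index would be the usual choice for larger sets.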


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
