Technology
MindGames Arena Generalization Track: In2AI Solution with Delayed Per-Step Reward Attribution
In2AI’s delayed reward attribution and efficient RL framework secured first place in the MindGames Arena Generalization Track, outperforming larger proprietary models such as GPT-5.
Universal Quantum Transformer
The Universal Quantum Transformer achieves exact, deterministic mathematical reasoning on NISQ hardware, outperforming classical models by avoiding stochastic instability and quadratic attention bottlenecks.
Agents on a Tree: Pathwise Coordination for Multi-Objective Molecular Optimization
ATOM is a multi-agent framework for molecular optimization that coordinates agents along tree paths to balance conflicting objectives. It outperforms baselines in Pareto coverage and hypervolume by preserving diverse design trajectories.
Emergent Collaborative Deliberation in Multi-Model AI Systems: A BFT-Derived Protocol for Epistemic Synthesis
The Consilium Protocol uses BFT principles to synthesize AI deliberations, revealing persona-driven epistemic behavior and RLHF-induced biases. It validates that low-cost models match frontier outputs when decoupled from identity.
Deliberative Curation: A Protocol for Multi-Agent Knowledge Bases
This paper proposes a deliberative curation protocol for multi-agent knowledge bases that improves their resilience against adversity. In simulations it outperforms majority voting, particularly under stress, by mitigating agent-specific governance challenges.
Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations
This paper proposes a post-solve auditing layer for MILP engines to evaluate solution robustness against perturbations. It defines feasible neighborhoods and solution smoothness to enhance reliability in industrial decision systems.
Probe Before You Edit: Probing-Guided Molecular Optimization for LLM Agents in Structure-Based Drug Design
PROBE guides LLM agents in drug design by probing pocket-ligand responses to create an EditManual, enabling collaborative optimization of binding affinity and druggability. This approach achieves state-of-the-art results on CrossDocked2020 by reducing common failure modes.
PropLLM: Propagation-Aware Scene Reconstruction for Network Fault Diagnosis
PropLLM reconstructs fault propagation paths using LLMs and causal attention, improving diagnosis accuracy and reducing hallucinations. It outperforms baselines on Wi-Fi and 5G datasets by tracing evidence step-by-step.
Efficient Test-time Inference for Generative Planning Models
This study proposes an efficient inference method for generative planning models that integrates learned models into a modified Open-Closed List search. The approach outperforms baselines in solution quality and computational efficiency across combinatorial domains.
TRACE: Trajectory Risk-Aware Compression for Long-Horizon Agent Safety
TRACE compresses long-horizon agent trajectories into latent safety evidence, outperforming baselines by 12.6% on benchmarks such as ASSEBench and remaining more stable as context length increases.
Hidden Thoughts Are Not Secret: Reasoning Trace Exposure in LLMs
The study introduces Reasoning Exposure Prompting (REP), a technique that extracts hidden reasoning traces from LLMs despite interface-level concealment. Results show REP effectively exposes internal reasoning signals for model distillation.
Medication-Aware Financial Exploitation Detection for Alzheimer's Patients Using Edge-Aware Interaction Risk Modeling
This study integrates medication adherence with financial data to detect exploitation in Alzheimer’s patients. The interaction-aware model significantly improved recall during vulnerable periods compared to financial-only baselines.
AXIOM: A Trust-First Neuro-Symbolic Execution Architecture for Verifiable Mathematical Reasoning
AXIOM uses LLMs only for canonicalization, routing to deterministic CAS pipelines for verifiable math reasoning. It achieves 94.36% correctness with 100% trust and zero confident-wrong answers.
Regularized Offline Policy Optimization with Posterior Hybrid Bayesian Belief
PhyB addresses offline RL’s computational challenges by approximating Bayesian expectations via convex combinations of dynamics models. This regularized optimization ensures monotonic improvement and achieves state-of-the-art benchmark performance.
LLM-Driven Co-Evolutionary Automated Heuristic Design for Bi-Component Coupled Combinatorial Optimization
CoEvo-AHD uses LLMs to co-evolve coupled heuristics for bi-component problems like TTP and TPP. It outperforms traditional methods by modeling operator interactions via cooperative evaluation and joint crossover.
ForeSci: Evaluating LLM Agents for Forward-Looking AI Research Judgment
ForeSci evaluates LLM agents’ forward-looking research judgment using a controlled benchmark. It reveals that while evidence organization helps, agents often decouple evidence from decisions.
MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition
MOSAIC is a framework for structured agentic model selection that uses semantic profiles and historical data to generate validated, reusable code blueprints. It improves upon existing AutoML and LLM agents by ensuring verifiable, context-aware workflow construction.
SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition
SHARP decouples memory and recognition, using accelerated offline replay to capture long-range, non-stationary patterns. It outperforms RNNs on benchmarks while maintaining streaming efficiency.
Latent Reward Steering: An Adaptive Inference-Time Framework that Implicitly Promotes Cognitive Behaviors in Reasoning LLMs
Latent Reward Steering (LRS) adaptively corrects reasoning errors in LLMs by optimizing sparse autoencoder latent states via a reward model. This inference-time framework implicitly promotes beneficial cognitive behaviors without predefined schemas.
AI Sovereignty as National Learning Capacity: A Human-Centered Learning Mechanics Viewpoint on France, the United States, and China
This paper proposes viewing France’s AI strategy through Human-Centered Learning Mechanics, framing sovereignty as managing information injection versus entropy dissipation. It argues for a balanced, human-centric approach to national AI advancement.