Global News Digest

Technology

arXiv

Coupling Language Models with Physics-based Simulation for Synthesis of Inorganic Materials

This study integrates LLMs with physics-based simulations to plan the synthesis of inorganic materials. Results show that LLMs, drawing on their implicit knowledge, generate more feasible strategies than traditional algorithms.

arXiv

The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary

Extended reasoning fails on deterministic tasks because of architectural limits, creating a "Deterministic Horizon" beyond which tool delegation is essential. Hybrid approaches significantly outperform pure neural methods, confirming an inherent capability ceiling.

arXiv

From Noise to Control: Parameterized Diffusion Policies

Parameterized Diffusion Policy (PDP) conditions diffusion on a learned behavior manifold, enabling precise control and smooth interpolation. It outperforms standard policies by synthesizing novel behaviors without weight updates.

arXiv

From "Weak" Signals to Strong Models: Preference Delta Aggregation with LoRA Merging

Preference Delta Aggregation merges LoRA adapters from weak model pairs to boost strong LLMs. Combined with Geometric Alignment Merging, it significantly outperforms baselines in reasoning and agentic search.

arXiv

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

This paper proposes ICAM, a six-layer framework for model-native computing, resolving LLM roles via a dual-plane perspective. It introduces three design laws to address cache, context, and agent efficiency challenges in future system architectures.

arXiv

Evaluating Bivariate Causal Statements Based on Mutual Compatibility

This study proposes compatibility and incompatibility scores to evaluate bivariate causal statements without relying on ground truth. These metrics effectively distinguish accurate from erroneous claims, aiding validation of insights from humans or AI.

arXiv

On Wednesdays, We Ask Questions: Optimizing "Active Listening" in Automated Legal Triage and Referral

The study finds that while cheap LLMs classify well, high-cost models like GPT-5 are needed for effective legal triage questions. However, inconsistencies in specific areas like domestic violence highlight the need for specialized screening modules.

arXiv

Robust Shielding for Safe Reinforcement Learning

This paper introduces robust shielding for safe reinforcement learning in unknown environments. It guarantees safety under worst-case transitions while allowing optimal agent behavior.

arXiv

MindZero: Learning Online Mental Reasoning With Zero Annotations

MindZero enables efficient online mental reasoning in MLLMs via self-supervised reinforcement learning, eliminating the need for annotations. It outperforms model-based methods in speed and accuracy for real-time AI assistance.

arXiv

Capability Self-Assessment: Teaching LLMs to Know Their Limits

This study shows that LLMs overestimate their competence, but that reinforcement learning effectively teaches Capability Self-Assessment (CSA) without degrading performance. CSA generalizes well and improves decision-making and training data selection.

arXiv

Closed-Loop Neural Activation Control in Vision-Language-Action Models

CTRL-STEER introduces a closed-loop framework for VLA models, using adaptive control signals to replace static steering. This approach enhances task success and stability by dynamically adjusting interventions based on real-time feedback.

arXiv

Geodesic Flow Matching for Denoising High-Dimensional Structured Representations

Geodesic Flow Matching denoises Spatial Semantic Pointers on toroidal manifolds, avoiding Euclidean flaws. It reduces SLAM tracking error by 72% and boosts neural efficiency by 40%.

arXiv

TIGER: Traceable Inference with Graph-Based Evidence Routing for Mitigating Hallucinations in Multimodal Generation

TIGER mitigates multimodal hallucinations by routing evidence via graph-based risk scoring for targeted, parameter-free fact repair. It geometrically reduces risk while maintaining task quality across diverse cross-modal pathways.

arXiv

CAST: Non-Privileged Clipped Asymmetric Self-Teaching with Advantage Flipping for GRPO

CAST enhances GRPO via non-privileged, clipped asymmetric self-teaching. It uses advantage flipping to correct token-level signals without reference answers.

arXiv

Grokers: Bottom-Up Inductive Comprehension and Write-Time Intelligence over Typed Knowledge Graphs

Grokers shifts comprehension to the write phase via bottom-up inductive traversal, eliminating per-query LM costs. It ensures high KV-cache hit rates and zero fallback rates through deterministic, theorem-backed indexing.

arXiv

A Multi-AI-agent Framework Enabling End-to-end Finite Element Analysis for Solid Mechanics Problems

AbaqusAgent uses six LLM-based agents to automate end-to-end Finite Element Analysis, achieving an 86% success rate across 50 solid mechanics problems. This framework simplifies FEA workflows and lowers entry barriers for users.

arXiv

Product-Aware Deep Autoencoders for Robust Process Monitoring in Multi-Product Cyber-Physical Systems

Product-aware autoencoders outperform global models in multi-product cyber-physical systems (CPS), achieving 100% attack detection versus 22.2% for global models. This approach eliminates blind spots caused by aggregated operational variance.

arXiv

Evaluating Interactive Reasoning in Large Language Models: A Hierarchical Benchmark with Executable Games

This paper introduces a hierarchical benchmark using 474 executable games to evaluate LLMs' interactive reasoning, evidence gathering, and metacognitive adaptation. Results show significant performance disparities, particularly in counterfactual revision tasks.

arXiv

On the evolution of the concept of probability as a mirror of the evolution of reason

This paper traces probability’s evolution as a mirror of reason, identifying its limits in handling conceptual ambiguity. It positions fuzzy logic and deep learning as complementary tools for modern scientific rationality.

arXiv

Optimal Transport-based Permutation-Invariant Bayesian Optimization of Offshore Wind Farm Layouts

This study introduces Permutation-Invariant Bayesian Optimization (PIBO) using Optimal Transport to optimize offshore wind farm layouts. PIBO outperforms traditional methods, reducing computational time by 50% while generating superior configurations.