Technology News - Global News Digest

arXiv

TrafficClaw: A Generalizable LLM Agent in the Unified Physical Environment for Urban Traffic Control

June 2, 2026 · Siqi Lai, Pan Zhang, Yuping Zhou, Jindong Han, Yansong Ning, Hao Liu

TrafficClaw is an LLM agent for unified urban traffic control, using spatiotemporal reasoning and RL to optimize interconnected subsystems. It demonstrates robust generalization and cross-subsystem coordination across six tasks in three major cities.

arXiv

Perspective on Bias in Biomedical AI: Preventing Downstream Healthcare Disparities

June 2, 2026 · Michal Rosen-Zvi, Yoav Kan-Tor, Michael Danziger, Agata Ferretti, Javier Aula-Blasco, Julia Falcao, Ron Shamir, Mira Marcus-Kalish, Mordechai Muszkat

Biomedical AI risks exacerbating healthcare disparities due to biased, non-representative training data. Adopting transparency and provenance standards is essential to mitigate these inequities.

arXiv

Neural Decision-Propagation for Answer Set Programming

June 2, 2026 · Thomas Eiter, Katsumi Inoue, Sota Moriyama

Neural Decision-Propagation (NDProp) replaces classical ASP solvers with neural networks and fuzzy logic to improve scalability. It efficiently learns stable models, enhancing accuracy in neuro-symbolic benchmarks.

arXiv

Trustworthy AI Suffers from Invariance Conflicts and Causality is The Solution

June 2, 2026 · Ruta Binkyte, Ivaxi Sheth, Zhijing Jin, Mohammad Havaei, Bernhard Schölkopf, Mario Fritz

This paper argues that causal reasoning resolves invariance conflicts among trustworthy AI goals like fairness and robustness. It offers a framework to mitigate trade-offs in both ML and foundation models.

arXiv

From Features to Actions: Explainability in Traditional and Agentic AI Systems

June 2, 2026 · Sindhuja Chaduvula, Jessee Ho, Kina Kim, Aravind Narayanan, Ahmed Y. Radwan, Mahshid Alinoori, Muskan Garg, Dhanesh Ramachandram, Shaina Raza

This study contrasts static feature attribution with trace-based diagnostics, revealing that the latter effectively diagnoses agentic AI failures. It advocates for trajectory-level explainability to evaluate autonomous behaviors.

arXiv

KnowledgeBerg: Evaluating Systematic Knowledge Coverage and Compositional Reasoning in Large Language Models

June 2, 2026 · Xiao Zhang, Qianru Meng, Yongjian Chen, Yumeng Wang, Johan Bos

KnowledgeBerg benchmarks LLMs on systematic knowledge coverage and compositional reasoning, revealing significant performance deficits across models and languages. Test-time compute and retrieval bring improvements, but limitations in structured knowledge management persist.

arXiv

ANDRE: An Attention-based Neuro-symbolic Differentiable Rule Extractor for Inductive Logic Programming

June 2, 2026 · Iman Sharifi, Peng Wei, Saber Fallah

ANDRE is a neuro-symbolic ILP framework using attention-based differentiable operators to extract interpretable rules from noisy, probabilistic data. It outperforms existing methods in stability and rule quality.

arXiv

The Refusal–Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models

June 2, 2026 · Alif Al Hasan, Sumon Biswas

This audit reveals LLMs trade off over-refusal of safe queries against harmful compliance, with safety behaviors driven more by post-training objectives than architecture.

arXiv

Towards a Virtual Neuroscientist: Autonomous Neuroimaging Analysis via Multi-Agent Collaboration

June 2, 2026 · Keqi Han, Songlin Zhao, Yao Su, Xiang Li, Yixuan Yuan, Lifang He, Carl Yang

NEXUS is an autonomous multi-agent framework for neuroimaging that dynamically adapts workflows to improve biomarker prediction. It outperforms static pipelines on ADHD-200 and ADNI datasets through collaborative, code-centric execution and hierarchical quality control.

arXiv

Causal state binding predicts action control in language agents

June 2, 2026 · Xiao Jia

The study introduces "causal state binding" to verify whether language agents' internal states genuinely drive their actions. Results show structured agents outperform controls, improving constraint-clean issue localization.

arXiv

RADAR: Redundancy-Aware Diffusion for Multi-Agent Communication Structure Generation

June 2, 2026 · Zhen Zhang, Wanjing Zhou, Juncheng Li, Hao Fei, Jun Wen, Wei Ji

RADAR is a redundancy-aware diffusion framework that iteratively generates adaptive multi-agent communication topologies. It outperforms baselines in accuracy, robustness, and token efficiency across six benchmarks.

arXiv

CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing

June 2, 2026 · Ming Du, Xiangyu Yin, Yanqi Luo, Dishant Beniwal, Songyuan Tang, Hemant Sharma, Mathew J. Cherukara

CVEvolve is an autonomous, zero-code framework using LLMs to discover algorithms for unstructured scientific data. It outperforms baselines and generalizes better, empowering scientists to process complex images without coding expertise.

arXiv

MMSkills: Towards Multimodal Skills for General Visual Agents

June 2, 2026 · Kangning Zhang, Shuai Shao, Qingyao Li, Jianghao Lin, Lingyue Fu, Shijian Wang, Wenxiang Jiao, Yuan Lu, Weiwen Liu, Weinan Zhang, Yong Yu

MMSkills introduces reusable multimodal procedural knowledge for visual agents, using state-conditioned packages to enhance runtime decision-making. It converts public trajectories into skills via an agentic generator and branch-loaded deployment.

arXiv

Herculean: An Agentic Benchmark for Financial Intelligence

June 2, 2026 · Xueqing Peng, Zhuohan Xie, Yupeng Cao, Haohang Li, Lingfei Qian, Yan Wang, Vincent Jim Zhang, Huan He, Xuguang Ai, Linhai Ma, Ruoyu Xiang, Yueru He, Yi Han, Shuyao Wang, Yuqing Guo, Mingyang Jiang, Yilun Zhao, Youzhong Dong, Xiaoyu Wang, Yankai Chen, Ye Y

Herculean is a new benchmark evaluating AI agents' financial skills across trading, hedging, insights, and auditing. It reveals that while agents handle trading well, they struggle with complex, long-horizon tasks like hedging and auditing.

arXiv

Capturing LLM Capabilities via Evidence-Calibrated Query Clustering

June 2, 2026 · Fangzhou Wu, Sandeep Silwal, Qiuyi Zhang

ECC enhances LLM capability assessment by refining semantic embeddings with posterior model comparisons. It outperforms baselines by ~18% and improves downstream applications like query routing.

arXiv

Evaluating Deep Research Agents on Expert Consulting Work: A Benchmark with Verifiers, Rubrics, and Cognitive Traps

June 2, 2026 · Tanmay Asthana, Aman Saksena, Divyansh Sahu

This benchmark reveals that top deep research agents struggle with expert consulting tasks, achieving acceptance rates below 22%. The study highlights significant gaps in multi-document analysis and susceptibility to cognitive traps.

arXiv

Coding Agent Is Good As World Simulator

June 2, 2026 · Hongyu Wang, Jingquan Wang, Bocheng Zou, Radu Serban, Dan Negrut

This paper introduces a coding agent framework that generates physics-based world models via executable code, outperforming video-based models in physical accuracy and visual quality for simulations.

arXiv

Ethical Hyper-Velocity (EHV): A Hardware-Rooted Zero-Trust Runtime Enforcement Architecture for Agentic AI Systems

June 2, 2026 · Riddhi Mohan Sharma

EHV is a hardware-rooted architecture for agentic AI that enforces policies in O(1) time using TEEs and formal verification. It eliminates governance latency, ensuring safety without compromising deployment speed.

arXiv

Towards a General Intelligence and Interface for Wearable Health Data

June 2, 2026 · Girish Narayanswamy, Maxwell A. Xu, A. Ali Heydari, Samy Abdel-Ghaffar, Marius Guerard, Kara Vaillancourt, Zhihan Zhang, Jake Garrison, Levi Albuquerque, Dimitris Spathis, Hong Yu, Hamid Palangi, Xuhai "Orson" Xu, David G. T. Barrett, Joseph Breda, Jed Mc

Researchers developed a foundation model for wearable health data, pretrained on massive unlabeled datasets, to improve health predictions. Integrated with LLM agents, it enables a Personal Health Agent for context-aware, safe insights.

arXiv

LLM-Guided Communication for Cooperative Multi-Agent Reinforcement Learning

June 2, 2026 · Sangjun Bae, Yisak Park, Sanghyeon Lee, Seungyul Han

LMAC uses LLMs to optimize communication protocols, enabling agents to accurately reconstruct shared states. This approach significantly improves performance and state recovery in cooperative multi-agent reinforcement learning benchmarks.