Technology News - Global News Digest

arXiv

ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL

June 2, 2026 · Zelin He, Haotian Lin, Boran Han, Wei Zhu, Haoyang Fang, Bernie Wang, Xuan Zhu, Runze Li, Matthew Reimherr

ReSkill harmonizes skill creation with policy optimization in Agentic RL using assertion-driven updates and Thompson Sampling. It outperforms existing methods by ensuring skills co-evolve with the policy, significantly boosting performance on unseen tasks.

arXiv

MobEvolve: An Agentic Self-Evolving Heuristic System for Interpretable Human Mobility Generation

June 2, 2026 · Junlin He, Yihong Tang, Tong Nie, Ao Qu, Yuebing Liang, Hamzeh Alizadeh, Bang Liu, Wei Ma, Lijun Sun

MobEvolve is an agentic, self-evolving heuristic system that uses LLM agents to refine mobility models. It outperforms existing methods in realism, interpretability, and efficiency on Singapore and Montreal datasets.

arXiv

Evidence-Gated LLM Priors for Multi-Objective Bayesian Optimization

June 2, 2026 · Jiangyu Chen, Banyi

This study introduces an evidence-gated framework for multi-objective Bayesian optimization, dynamically calibrating LLM priors via objective-specific reputation markets. Results show this approach enhances robustness over static priors, though raw LLM confidence proves inconsistent across benchmarks.

arXiv

S-SPPO: Semantic-Calibrated Self-Play Preference Optimization

June 2, 2026 · Xiwen Chen, Wenhui Zhu, Jingjing Wang, Peijie Qiu, Zhipeng Wang, Huayu Li, ZhengXiao He, Xuanzhao Dong, Prayag Tiwari, Mingkun Xu, Yujian Xiong, Feng Luo, Abolfazl Razi, Brendan Hogan Rappazzo, Anderson Schneider, Yuriy Nevmyvaka

S-SPPO stabilizes Self-Play Preference Optimization via semantic calibration, preventing policy degeneration. It achieves superior AlpacaEval 2.0 performance with Llama-3-8B without extra human annotations.

arXiv

TrafficRAG: A Multimodal RAG Framework for Traffic Accident Liability Determination

June 2, 2026 · Xu Li, Zedong Fu, Xinyi Li, Xun Han

TrafficRAG is a multimodal RAG framework that combines vision-language models with hybrid retrieval to automate traffic accident liability determination. It outperforms baselines, achieving 77.32% legal accuracy and 81.71% factual faithfulness.

arXiv

Characterization of Multi-Model Agentic AI Systems on General Tasks via Trace-Driven Simulation

June 2, 2026 · Donghwan Kim, Prakhar Singh, Younghoon Min, Jongryool Kim, Jongse Park, Kiwan Maeng

This study introduces GAIATrace and Vidur-Agent to simulate and analyze multi-model agentic AI systems. These tools enable reproducible, cost-effective evaluation of system dynamics and design choices on general tasks.

arXiv

TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment

June 2, 2026 · Thi-Nhung Nguyen, Linhao Luo, Rollin Omari, Junae Kim, Thuy-Trang Vu, Dinh Phung

TriAlign introduces Truth-Invariant Alignment via multi-agent RL, balancing personalized LLM outputs with universal truth consistency. It reduces factual disparities across social groups while maintaining high personalization quality.

arXiv

EvoBrain: Continual Learning of EEG Foundation Models Across Heterogeneous BCI Tasks

June 2, 2026 · Yangxuan Zhou, Sha Zhao, Jiquan Wang, Shijian Li, Gang Pan

EvoBrain enables continual learning for EEG foundation models via Neuro-Spectral Task Normalization and Response-Affinity Distillation. It outperforms state-of-the-art methods across six BCI tasks, achieving unified decoding with minimal catastrophic forgetting.

arXiv

Stochastic convergence of parallel asynchronous adaptive first-order methods

June 2, 2026 · Serge Gratton, Philippe L. Toint

This paper analyzes the stochastic convergence of parallel asynchronous adaptive first-order methods for non-convex optimization, proving an O(1/√t) rate. Empirical results confirm their suitability for large-scale, heterogeneous machine learning environments.

arXiv

Structure-Guided Adaptive Propagation for Protein-Protein Interaction Site Prediction

June 2, 2026 · Enqiang Zhu, Yizi Liu, Yilong Luo, Yao Chen, Yu Zhang, Baoshan Ma

SGAP-PPIS uses structure-guided adaptive propagation to dynamically adjust information diffusion for accurate protein-protein interaction site prediction. It outperforms rigid models by leveraging equivariant graph neural networks to tailor propagation to local geometric contexts.

arXiv

Consistency evaluation of benchmarks used for causal discovery

June 2, 2026 · Yuzhe Zhang, Chihui Chen, Lina Yao, Chen Wang

This study introduces an LLM-based workflow to verify 11 causal discovery benchmarks against 38,081 papers, revealing significant discrepancies with current literature. These findings highlight critical reliability issues in widely used benchmarks for causal discovery research.

arXiv

Token Predictors Are Not Planners: Building Physically Grounded Causal Reasoners

June 2, 2026 · Zheng Lu, Mingqi Gao, Qinlei Xie, Wanqi Zhong, Hanwen Cui, Heng Cao, Zirui Song, Yifan Yang, Chong Luo, Bei Liu, Yiming Li

The authors introduce Causal-Plan-Bench and Causal Planner to shift embodied AI from token prediction to physical causal reasoning. Their model achieves superior performance by internalizing physical logic, validating a Causal Scaling Law.

arXiv

OctoT2I: A Self-Evolving Agentic Text-to-Image Router

June 2, 2026 · Xu Jiang, Bin Chen, Gehui Li, Yule Duan, Ronggang Wang, Jian Zhang

OctoT2I is a self-evolving agentic router that optimizes text-to-image generation via an unsupervised, multi-round routing strategy. It achieves superior speed and energy efficiency while maintaining high-quality outputs without human supervision.

arXiv

Evaluation of Baseline Methods for IDD-based SSD External Memory Search

June 2, 2026 · Yuki Suzuki, Alex Fukunaga

This study evaluates simple baseline methods for SSD-based A* search using immediate duplicate detection (IDD), addressing gaps in prior research on external memory search strategies.

arXiv

CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback

June 2, 2026 · Bin Chen, Xinye Liao, Yiming Liu, Xin Liao, Chonghan Liu

CAPF guides LLM search agents using verifier-side feedback to repair failed rollouts, boosting Qwen3-4B’s QA accuracy from 44.7% to 48.5% across seven benchmarks.

arXiv

EVA-Net: Subject-Independent EEG Motor Decoding with Video-Derived Motor Priors

June 2, 2026 · Ziyuan Li, Yueyu Sun, Yimeng Zhang

EVA-Net uses video-derived motor priors to align EEG features, enabling robust subject-independent decoding. It achieves an 8.66% leave-one-subject-out (LOSO) accuracy gain on EEGMMI, outperforming text-based methods.

arXiv

Does Compression Preserve Uncertainty? A Unified Benchmark for Quantized and Sparse LLMs via Conformal Prediction

June 2, 2026 · Yujia Tong, Yuxi Wang, Yunyang Wan, Tian Zhang, Junhao Dong, Jingling Yuan

This study benchmarks quantized and sparse LLMs using conformal prediction, revealing that compression decouples accuracy from uncertainty reliability. It urges incorporating uncertainty-aware metrics into model compression workflows for safer deployment.

arXiv

Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

June 2, 2026 · Ailiya Borjigin, Igor Stadnyk, Ben Bilski, Maksym Chikita, Dmytro Kyrylenko, Sofiia Pidturkina, Julia Stadnyk

InKH is a framework for financial LLM agents that internalizes complexity via structured memory, reducing latency by 83% and token costs by 82% while improving task quality and auditability.

arXiv

WorldCoder-Bench: Benchmarking Physically Grounded 3D World Synthesis

June 2, 2026 · Shuo Lu, Yinuo Xu, Kecheng Yu, Siru Jiang, Yongcan Yu, Yubin Wang, Haitao Yang, Yuxiang Zhang, Bin Wang, Ran He, Jian Liang

WorldCoder-Bench benchmarks LLMs on synthesizing physically grounded 3D worlds using StateProbe to verify hidden runtime states. Results show top models achieve low verification coverage, highlighting significant challenges in generating robust, interactive 3D environments.

arXiv

Community-Aware Assessment of Social Textual Engagement and Resonance: A Human-Centric Perspective on User-Generated Content Evaluation

June 2, 2026 · Tianjiao Li, Kai Zhao, Xiang Li, Yang Liu, Huyang Sun

The study introduces CASTER and MEDEA, a human-centric framework that uses Social-CoT to assess the resonance of user-generated content (UGC) via simulated community personas. MEDEA outperforms baselines on the new CASTER-Bench with empathetic, interpretable reasoning.