Technology News - Global News Digest

arXiv

Monitoring Agentic Systems Before They're Reliable

June 2, 2026 · Marisa Ferrara Boston, Glen Hanson, Effi Georgala, JD Hudgens, Heather Frase

This paper proposes a structural monitoring framework for agentic systems, using variance and FMEA to detect integration flaws. It proves structural defects mask task-level errors, enabling automated triage of 97% of issues.

arXiv

Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition

June 2, 2026 · Shuo Zhang, Chenqi Li, Tingting Zhu

This paper introduces SAMN, a hyperparameter-free method for long-tailed recognition that uses monotonic normalization to rescale class weights. It improves performance by removing the reliance on regularization, achieving state-of-the-art results across benchmarks.

arXiv

Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation

June 2, 2026 · Siyuan Bian, Congrong Xu, Jun Gao

MDA eliminates flying-point artifacts by using mixture-density representations to predict multiple depth hypotheses per pixel. This approach robustly handles boundary ambiguity, transparency, and sky regions with minimal computational cost.

arXiv

SimSD: Simple Speculative Decoding in Diffusion Language Models

June 2, 2026 · Junxia Cui, Haotian Ye, Runchu Tian, Hongcan Guo, Jinya Jiang, Haoru Li, Chaojie Ren, Yiming Huang, Kaijie Zhu, Zhongkai Yu, Kun Zhou, Jingbo Shang

SimSD enables speculative decoding in diffusion LLMs via a plug-and-play masking strategy, restoring token-level verification without training. This boosts inference speed while preserving parallel decoding benefits.

arXiv

From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression

June 2, 2026 · Elia Cunegatti, Marcus Vukojevic, Erik Nielsen, Giovanni Iacca

SubFit compresses LLMs at the submodule level, outperforming layer-based methods in accuracy and speed. It achieves superior perplexity-accuracy trade-offs across various sparsity levels.

arXiv

Permissive Safety Through Trusted Inference: Verifiable Belief-Space Neural Safety Filters for Assured Interactive Robotics

June 2, 2026 · Haimin Hu

This paper certifies high-probability safety for belief-space neural filters using conformal prediction. It addresses runtime inference errors to enable less conservative, verifiable safety in interactive robotics.
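
As a rough illustration of the conformal-prediction ingredient mentioned above, the sketch below computes a standard split-conformal threshold from calibration scores. It is not the paper's safety filter: the function name `conformal_quantile`, the example scores, and the miscoverage level `alpha` are hypothetical placeholders.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score.

    `scores` are nonconformity scores from a held-out calibration set
    (hypothetical here); `alpha` is the tolerated miscoverage rate.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: no finite threshold exists.
        return math.inf
    return sorted(scores)[k - 1]


# Nine made-up calibration scores, 50% miscoverage level.
print(conformal_quantile([9, 1, 8, 2, 7, 3, 6, 4, 5], 0.5))  # -> 5
```

Under the usual exchangeability assumption, a fresh score falls at or below this threshold with probability at least 1 - alpha, which is the kind of high-probability statement such a certificate builds on.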

arXiv

Algebraic anti-unification

June 2, 2026 · Christian Antić

This paper establishes an algebraic theory of anti-unification within universal algebra, extending the field beyond syntactic term representations. It defines key concepts like minimally general generalizations and explores their computability in finite algebras.

arXiv

Unsupervised Cognition

June 2, 2026 · Alfredo Ibias, Hector Antona, Guillem Ramirez-Miranda, Enric Guinovart, Eduard Alarcon

This study introduces a novel, primitive-driven unsupervised learning framework that outperforms state-of-the-art methods. The model exhibits cognitive-like behaviors, surpassing both unsupervised and supervised benchmarks.

arXiv

AdaCodec: A Predictive Visual Code for Video MLLMs

June 2, 2026 · Haowen Hou, Zhen Huang, Zheming Liang, Qingyi Si, Chenglin Li, Shuai Dong, Kele Shao, Ruilin Li, Dianyi Wang, Nan Duan, Jiaqi Wang

AdaCodec optimizes video MLLMs by transmitting only reference frames or compact change descriptions, reducing token usage to 1/7 of baselines. This approach boosts long-video performance and cuts latency from 9.26 to 1.62 seconds.

arXiv

Explainable AI Through a Democratic Lens: DhondtXAI for D'Hondt-Projected Feature Attribution

June 2, 2026 · Turker Berk Donmez

DhondtXAI is a novel, SHAP-independent XAI framework using the D'Hondt method for feature attribution. It ensures completeness and achieves high accuracy, matching SHAP benchmarks on healthcare datasets.
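
For readers unfamiliar with the D'Hondt method named above, here is a minimal sketch of the classic highest-averages allocation applied to feature scores. It shows only the electoral rule, not DhondtXAI itself: the function name `dhondt_allocate`, the feature names, and the importance scores are made up for illustration.

```python
def dhondt_allocate(scores, seats):
    """Distribute `seats` units among features with the D'Hondt rule.

    Each round, the next unit goes to the feature with the largest
    quotient score / (units_already_won + 1). Ties go to the feature
    listed first in `scores`.
    """
    if seats < 0:
        raise ValueError("seats must be non-negative")
    allocation = {name: 0 for name in scores}
    for _ in range(seats):
        winner = max(scores, key=lambda name: scores[name] / (allocation[name] + 1))
        allocation[winner] += 1
    return allocation


# Hypothetical raw importance scores for four features, 8 units to distribute.
importances = {"age": 50, "bmi": 40, "glucose": 15, "smoker": 10}
print(dhondt_allocate(importances, 8))
# -> {'age': 4, 'bmi': 3, 'glucose': 1, 'smoker': 0}
```

Because exactly one unit is handed out per round, the allocations always sum to the requested total, which is the sense in which a D'Hondt-style attribution is complete.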

arXiv

Stop Wandering, Find the Keys: LLMs Discriminate Key States for Efficient Multi-Agent Exploration

June 2, 2026 · Yun Qu, Boyuan Wang, Yuhang Jiang, Jianzhun Shao, Yixiu Mao, Heming Zou, Chang Liu, Cheems Wang, Meiqin Liu, Xiangyang Ji

LEMAE uses LLMs to identify key states, guiding multi-agent exploration via SHIR rewards and KSMT memory. This reduces redundancy, outperforming SOTA methods with up to 10x speedup on benchmarks.

arXiv

Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling

June 2, 2026 · Seojeong Park, Jiho Choi, Junyong Kang, Seonho Lee, Jaeyo Shin, Hyunjung Shim

This study addresses perceptual judgment bias in multimodal LLMs by introducing a perturbation-based dataset and a GRPO reward framework. The method significantly improves evaluation accuracy, consistency, and alignment with human judgment.

arXiv

Learning to Reduce Search Space for Generalizable Neural Routing Solver

June 2, 2026 · Changliang Zhou, Xi Lin, Zhenkun Wang, Qingfu Zhang

L2R is a novel learning-based dynamic search space reduction framework for neural routing solvers. It enables scalable, high-quality solutions for VRPs with up to 10 million nodes.

arXiv

Safety Must Precede the Deployment of Open-Ended AI

June 2, 2026 · Ivaxi Sheth, Jan Wehner, Sahar Abdelnabi, Ruta Binkyte, Mario Fritz

Open-ended AI’s autonomy creates unique safety risks like emergent misalignment. This paper urges prioritizing safety and proactive research before large-scale deployment.

arXiv

Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-Tuning and Can Be Mitigated by Machine Unlearning

June 2, 2026 · Yiwei Chen, Yuguang Yao, Yihua Zhang, Bingquan Shen, Gaowen Liu, Sijia Liu

Spurious correlations undermine VLM safety fine-tuning, enabling attacks and over-caution. Machine unlearning mitigates this by removing biased mappings, reducing attack success by 60% and unnecessary rejections by 84%.

arXiv

Finding the Minimal Parameter Budget for Implicit Reasoning: A Data Complexity Driven Scaling Law for Language Models

June 2, 2026 · Xinyi Wang, Shawn Tan, Shenbo Xu, Mingyu Jin, William Yang Wang, Rameswar Panda, Yikang Shen

This study establishes a scaling law linking minimal parameter budgets for implicit reasoning to data complexity. It finds models can reason over ~0.008 bits per parameter, guiding efficient LM sizing.

arXiv

Language Model Networks: Supervision-Efficient Learning through Dense Communication

June 2, 2026 · Shiguang Wu, Yaqing Wang, Quanming Yao

LMNet enables supervision-efficient learning by replacing discrete language communication with dense, differentiable vector exchanges between LLM nodes. This approach facilitates end-to-end optimization and significant performance gains with minimal training overhead.

arXiv

Formally Solving Answer-Construction Problems in Lean

June 2, 2026 · Jialiang Sun, Yuzhi Tang, Ao Li, Chris J. Maddison, Kuldeep S. Meel

The ECP framework addresses Lean answer-construction gaps by combining general LLMs for candidate generation with prover LLMs for verification. It outperforms baselines on PutnamBench and MathArena, ensuring admissible, formally verified solutions.

arXiv

EMoE: Training-Free Expert Disagreement for Uncertainty-Aware Text-to-Image Diffusion

June 2, 2026 · Lucas Berry, Axel Brando, Wei-Di Chang, Juan Camilo Gamboa Higuera, David Meger

EMoE estimates epistemic uncertainty in text-to-image diffusion via training-free expert disagreement. It outperforms baselines in ranking prompt alignment and reveals language-specific biases.

arXiv

Agent Guide: A Simple Agent Behavioral Watermarking Framework

June 2, 2026 · Kaibo Huang, Zipei Zhang, Zhongliang Yang, Linna Zhou

Agent Guide embeds watermarks via behavioral probability biases, enabling reliable detection without disrupting agent actions. It offers a robust, low-false-positive solution for tracing and securing intelligent agents in digital environments.