Technology News - Global News Digest

arXiv

Calibrating Uncertainty for Zero-Shot Adversarial CLIP

June 2, 2026 · Wenjing Lu, Zerui Tao, Yuning Qiu, Dongping Zhang, Yang Yang, Qibin Zhao

This paper introduces an adversarial fine-tuning method for CLIP that reparameterizes outputs as Dirichlet distributions to balance accuracy and uncertainty. It restores calibrated uncertainty under adversarial perturbations while preserving zero-shot generalization.
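
As a rough illustration of what reparameterizing classifier outputs as a Dirichlet distribution can look like, the sketch below follows the common evidential-learning recipe. It is not taken from the paper: the softplus evidence mapping, the function name `dirichlet_from_logits`, and the example logits are all assumptions made for illustration.

```python
import math

def dirichlet_from_logits(logits):
    """Map raw class logits to Dirichlet parameters, mean probabilities, and an uncertainty score.

    Assumption: evidence = softplus(logit), alpha = evidence + 1 (a common
    evidential-learning convention, not necessarily the paper's choice).
    """
    # Numerically stable softplus: log(1 + exp(z)) = max(z, 0) + log1p(exp(-|z|))
    evidence = [max(z, 0.0) + math.log1p(math.exp(-abs(z))) for z in logits]
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)                  # total Dirichlet concentration
    probs = [a / strength for a in alpha]  # mean of the Dirichlet
    uncertainty = len(alpha) / strength    # shrinks as total evidence grows
    return alpha, probs, uncertainty

# Hypothetical logits: one confident prediction, one uninformative prediction.
_, confident_probs, confident_u = dirichlet_from_logits([10.0, 0.0, 0.0])
_, flat_probs, flat_u = dirichlet_from_logits([0.0, 0.0, 0.0])
print(confident_u < flat_u)
```

Under these assumptions, the uncertainty score falls as evidence accumulates for any class, which is the quantity a calibration-aware objective would try to keep meaningful under perturbation.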

arXiv

VocSim: A Training-free Benchmark for Zero-shot Content Identity in Single-source Audio

June 2, 2026 · Maris Basha, Anja Zai, Sabine Stoll, Richard Hahnloser

VocSim is a training-free benchmark evaluating zero-shot content identity in 125k single-source audio clips. It reveals generalization gaps in low-resource speech while validating embeddings via bioacoustic and HEAR benchmarks.

arXiv

Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation

June 2, 2026 · Yuxuan Qiao, Dongqin Liu, Hongchang Yang, Wei Zhou, Songlin Hu

This study introduces TOP-R, a privacy risk where LLM agents leak sensitive data by combining non-sensitive tool outputs. The authors propose TOP-Align, a post-training method that significantly reduces leakage compared to prompt-based safeguards.

arXiv

Ev-Trust: An Evolutionarily Stable Trust Mechanism for Decentralized LLM-Based Multi-Agent Service Economies

June 2, 2026 · Jiye Wang, Shiduo Yang, Ting Qiao, Jiayu Qin, Jianbin Li, Yu Wang, Yuanhe Zhao

Ev-Trust introduces an evolutionarily stable trust mechanism for decentralized LLM-agent economies, using cross-validation and revenue integration to ensure cooperative stability and reduce fraud.

arXiv

Control of a Twin Rotor using Twin Delayed Deep Deterministic Policy Gradient (TD3)

June 2, 2026 · Zeyad Gamal, Youssef Mahran, Ayman El-Badawy

This study uses the TD3 reinforcement learning algorithm to control a Twin Rotor Aerodynamic System. Simulations and lab tests show it outperforms PID controllers under wind disturbances.
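
For readers unfamiliar with TD3, the sketch below shows how the algorithm typically forms its critic target: clipped noise on the target action (target policy smoothing) and the minimum of two target critics (clipped double Q-learning). It uses a scalar action for brevity; the function name, the callables, and every hyperparameter value are illustrative assumptions, not the study's configuration.

```python
import random

def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Compute a TD3-style critic target for one transition with a scalar action.

    actor_target(state) -> action and critic*_target(state, action) -> value
    are hypothetical callables standing in for the target networks.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = actor_target(next_state) + noise
    next_action = max(action_low, min(action_high, next_action))

    # Clipped double Q-learning: use the smaller of the two target estimates.
    q_next = min(critic1_target(next_state, next_action),
                 critic2_target(next_state, next_action))

    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - float(done)) * q_next

# Toy usage with constant stand-in networks.
target = td3_target(1.0, 0.0, False,
                    actor_target=lambda s: 0.0,
                    critic1_target=lambda s, a: 5.0,
                    critic2_target=lambda s, a: 3.0)
print(target)
```

The third TD3 ingredient, delayed actor updates, lives in the training loop and is omitted here.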

arXiv

Reinforcement Learning Position Control of a Quadrotor Using Soft Actor-Critic (SAC)

June 2, 2026 · Youssef Mahran, Zeyad Gamal, Ayman El-Badawy

This study uses Soft Actor-Critic RL to control quadrotor thrust vectors instead of rotor RPMs. The approach achieves faster training and superior, smoother path-following performance.

arXiv

Dynamic Entropy Tuning in Reinforcement Learning Low-Level Quadcopter Control: Stochasticity vs Determinism

June 2, 2026 · Youssef Mahran, Zeyad Gamal, Ayman El-Badawy

This study compares dynamic entropy tuning in SAC against TD3 for quadcopter control. Results show dynamic entropy significantly improves performance by enhancing exploration and mitigating catastrophic forgetting.
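
Dynamic entropy tuning in SAC is usually implemented by treating the temperature as a learnable parameter that is pushed toward a target entropy. The sketch below performs one plain gradient step on the log-temperature, a parameterization common in open-source implementations; the function name, learning rate, and sample values are assumptions and do not reflect the study's settings.

```python
import math

def update_log_alpha(log_alpha, log_probs, target_entropy, lr=0.01):
    """One gradient-descent step on the SAC temperature (log-parameterized).

    Loss assumed here: J = -log_alpha * mean(log_pi + target_entropy),
    where log_probs are log pi(a|s) for actions sampled from the current policy.
    """
    mean_gap = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    grad = -mean_gap  # dJ / d(log_alpha)
    return log_alpha - lr * grad

# Hypothetical batch: the policy is more random than the target requires
# (entropy estimate 1.0 versus target -2.0), so the temperature should fall.
new_log_alpha = update_log_alpha(0.0, [-1.0, -1.0], target_entropy=-2.0, lr=0.1)
print(math.exp(new_log_alpha) < 1.0)
```

When the policy's entropy drops below the target, the same update raises the temperature, which strengthens the entropy bonus and encourages further exploration.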

arXiv

Uncovering Competency Gaps in Large Language Models and Their Benchmarks

June 2, 2026 · Maty Bohacek, Nino Scherrer, Nicholas Dufour, Thomas Leung, Christoph Bregler, Stephanie C. Y. Chan

This study introduces an unsupervised method using sparse autoencoders to automatically detect competency gaps in LLMs and benchmarks. It reveals hidden model weaknesses and benchmark deficiencies, offering a complementary tool for refining evaluation frameworks.

arXiv

MGRegBench: A Novel Benchmark Dataset with Anatomical Landmarks for Mammography Image Registration

June 2, 2026 · Svetlana Krasnova, Emiliya Starikova, Ilia Naletov, Andrey Krylov, Dmitry Sorokin

MGRegBench is a novel benchmark dataset with anatomical landmarks for mammography registration, enabling standardized, reproducible comparisons of classical and deep learning methods.

arXiv

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

June 2, 2026 · Taekyung Ki, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Sung Ju Hwang

Avatar Forcing enables real-time, interactive head avatars using diffusion forcing and label-free optimization. It achieves 500 ms latency and highly expressive reactions, outperforming baselines in speed and user preference.

arXiv

VLM4VLA: Revisiting Vision-Language-Models in Vision-Language-Action Models

June 2, 2026 · Jianke Zhang, Xiaoyu Chen, Qiuyue Wang, Mingsheng Li, Yanjiang Guo, Yucheng Hu, Jiajun Zhang, Shuai Bai, Junyang Lin, Jianyu Chen

VLM4VLA reveals that VLM general capabilities and specialized embodied skills poorly predict VLA performance. Instead, the visual module is the primary bottleneck, and adding control-relevant supervision to it yields consistent improvements.

arXiv

Paradoxical noise preference in RNNs

June 2, 2026 · Noah Eckstein, Manoj Srinivasan

Contrary to standard practice, continuous-time RNNs often perform best with noise during inference, as removing it biases outputs near activation nonlinearities. This effect stems from noise-induced shifts in the network's stochastic dynamics.

arXiv

Safe-FedLLM: Delving into the Safety of Federated Large Language Models

June 2, 2026 · Mingxiang Tao, Yu Tian, Wenxuan Tu, Yue Yang, Xue Yang, Xiangyan Tang

Safe-FedLLM defends Federated LLMs by using lightweight classifiers to detect malicious LoRA updates. This three-tiered framework ensures robustness against attacks without compromising performance or training speed.

arXiv

Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics

June 2, 2026 · Subhadeep Roy, Gagan Bhatia, Steffen Eger

This study identifies prototypicality bias in multimodal metrics, which favor stereotypical over semantically accurate images. The PROTOBIAS benchmark exposes these flaws, highlighting the gap between automated scores and human judgment.

arXiv

FastSLM: Hierarchical Temporal Abstraction for Efficient Long-Form Speech Adaptation

June 2, 2026 · Junseok Lee, Sangyong Lee, Chang-Jae Chun

FastSLM uses a Hierarchical Temporal Abstractor to compress long-form audio by 97% without losing context, achieving SOTA performance with fewer resources.

arXiv

Hot-Start Chinese Language Modeling: Visual Glyphs Accelerate Sample-Efficient Learning

June 2, 2026 · Shuyang Xiang, Hao Guan

Visual glyph inputs accelerate early training of Chinese language models, but the models converge to a final accuracy similar to that reached with token IDs. This "hot-start" effect stems from pre-encoded radical structures, offering faster alignment without enhancing ultimate model capacity.

arXiv

DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion

June 2, 2026 · Hanlin Zhang, Daxin Tan, Dehua Tao, Xiao Chen, Haochen Tan, Yunhe Li, Yuchen Cao, Linqi Song

DSA-Tokenizer disentangles speech into semantic and acoustic tokens via flow matching, enabling high-fidelity reconstruction and voice cloning. It achieves efficient, controllable generation with low error rates, proving effective for large-model speech tasks.

arXiv

SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models

June 2, 2026 · Bingxin Xu, Yuzhang Shang, Binghui Wang, Emilio Ferrara

SilentDrift exploits VLA action chunking to launch stealthy backdoor attacks using C2-continuous perturbations. It achieves 93.2% success with <2% poisoning while maintaining high clean task performance.

arXiv

MASCOT: Towards Multi-Agent Socio-Collaborative Companion Systems

June 2, 2026 · Yiyang Wang, Yiqiao Jin, Alex Cabral, Josiah Hester

MASCOT is a multi-agent framework using bi-level optimization to prevent persona collapse and sycophancy. It enhances role consistency and dialogue diversity, outperforming state-of-the-art baselines in socio-collaborative companionship.

arXiv

Physics-Encoded Inverse Modeling for Arctic Snow Depth Prediction

June 2, 2026 · Akila Sampath, Vandana Janeja, Jianwu Wang

PhysE-Inv predicts Arctic snow depth using physics-encoded inverse modeling with LSTM and contrastive learning. It outperforms baselines, reducing MSE by 24.7% and boosting parameter estimation by 17.3%.