Technology News - Global News Digest

arXiv

TuneAgent: Agentic Operating System Kernel Tuning with Reinforcement Learning

June 2, 2026 · Hongyu Lin, Yuchen Li, Haoran Luo, Zhenghong Lin, Libo Zhang, Mingjie Xing, Yanjun Wu

TuneAgent uses RL-driven LLMs to autonomously tune Linux kernels, achieving up to 5.6% performance gains. It ensures valid configurations via structured rewards and a two-phase training approach.

arXiv

Language-Native Materials Processing Design by Lightly Structured Text Database and Reasoning Large Language Model

June 2, 2026 · Yuze Liu, Zhaoyuan Zhang, Xiangsheng Zeng, Yihe Zhang, Leping Yu, Liu Yang, Lejia Wang, Xi Yu

This framework optimizes materials synthesis by using lightly structured text and reasoning LLMs to extract procedural logic from unstructured data. It successfully streamlined boron nitride nanosheet production, reducing trial-and-error cycles through iterative, evidence-based protocol refinement.

arXiv

Position: Beyond Sensitive Attributes, ML Fairness Should Quantify Structural Injustice via Social Determinants

June 2, 2026 · Zeyu Tang, Alex John London, Atoosa Kasirzadeh, Sarah Stewart de Ramirez, Peter Spirtes, Kun Zhang, Sanmi Koyejo

This paper argues ML fairness must quantify structural injustice via social determinants, not just sensitive attributes. Auditing these determinants before mitigation prevents new injustices and addresses systemic inequities.

arXiv

Towards a Physics Foundation Model

June 2, 2026 · Florian Wiesner, Zoë J. Gray, Matthias Wessling, Stephen Baek

The General Physics Transformer (GPhyT) demonstrates that a single model can simulate diverse physical phenomena, outperforming specialized solvers and enabling zero-shot generalization. This work establishes a viable foundation for universal Physics Foundation Models.

arXiv

Deep Learning as the Disciplined Construction of Tame Objects

June 2, 2026 · Gilles Bareilles, Allen Gehret, Johannes Aspman, Jana Lepšová, Jakub Mareček

This paper uses tame geometry to provide convergence guarantees for stochastic gradient descent in nonsmooth, nonconvex deep learning. It frames deep learning models as compositions of tame functions, offering a rigorous mathematical framework for AI analysis.

arXiv

T-POP: Test-Time Personalization with Online Preference Feedback

June 2, 2026 · Zikun Qu, Min Zhang, Mingze Kong, Xiang Li, Zhiwei Shang, Zhiyong Wang, Yikun Ban, Shuang Qiu, Yao Shu, Zhongxiang Dai

T-POP enables real-time LLM personalization by learning user preferences via online feedback and dueling bandits, without updating model parameters. It effectively solves the cold-start problem, outperforming existing baselines with rapid, data-efficient adaptation.

arXiv

End-to-End Deep Learning for Predicting Metric Space-Valued Outputs

June 2, 2026 · Yidong Zhou, Su I Iao, Hans-Georg Müller

E2M predicts metric space-valued outputs via weighted Fréchet means, preserving intrinsic geometry without surrogate embeddings. It achieves state-of-the-art results on diverse structured data, including networks and distributions.
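
To make the idea of a weighted Fréchet mean concrete, here is a minimal sketch for one metric space where it has a closed form: one-dimensional probability distributions under the 2-Wasserstein metric, where the weighted Fréchet mean is the weighted average of the quantile functions. This is an illustration only and not E2M itself; the function name, the shared quantile grid, and the hand-picked weights are assumptions (in the summary above, such weights would come from a trained network).

```python
import numpy as np


def weighted_frechet_mean_wasserstein(quantiles, weights):
    """Weighted Frechet mean of 1-D distributions under the 2-Wasserstein metric.

    quantiles: array of shape (n, m); row i holds the quantile function of
               distribution i evaluated on a shared probability grid.
    weights:   array of shape (n,), nonnegative; normalized here to sum to 1.

    Hypothetical helper for illustration; not taken from the E2M paper.
    """
    q = np.asarray(quantiles, dtype=float)
    w = np.asarray(weights, dtype=float)
    if q.ndim != 2 or w.shape != (q.shape[0],):
        raise ValueError("expected quantiles of shape (n, m) and weights of shape (n,)")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be nonnegative with a positive sum")
    w = w / w.sum()
    # In this metric space the minimizer of sum_i w_i * W2(mu, mu_i)^2 is the
    # distribution whose quantile function is the weighted average of the inputs.
    return w @ q


# Two distributions that differ only by a shift of 2.
grid_values = np.linspace(-1.0, 1.0, 5)
sample_quantiles = np.vstack([grid_values, grid_values + 2.0])
mean_quantile = weighted_frechet_mean_wasserstein(sample_quantiles, [1.0, 1.0])
```

With equal weights the result is the same distribution shifted by 1, so the output stays inside the space of distributions rather than passing through a surrogate embedding.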

arXiv

v-HUB: A Benchmark for Video Humor Understanding from Vision and Sound

June 2, 2026 · Zhengpeng Shi, Yanpeng Zhao, Jianqun Zhou, Yuxuan Wang, Qinrong Cui, Wei Bi, Songchun Zhu, Bo Zhao, Zilong Zheng

v-HUB is a new benchmark for evaluating multimodal LLMs on video humor understanding. It reveals that audio cues significantly aid models in comprehending humor compared to visual-only inputs.

arXiv

Distillation of Large Language Models via Concrete Score Matching

June 2, 2026 · Yeongmin Kim, Donghyeok Shin, Mina Kang, Byeonghu Na, Il-Chul Moon

Concrete Score Distillation (CSD) improves LLM distillation by aligning relative logit differences, overcoming softmax blurring and shift invariance limits. It consistently outperforms recent methods in fidelity and diversity across various benchmarks.

arXiv

Make a Video Call with LLM: A Measurement Campaign over Six Mainstream Apps

June 2, 2026 · Jiayang Xu, Xiangjie Huang, Zijie Li, Antariksh Verma, Zili Meng

This study benchmarks six LLM video chat apps across quality, latency, and overhead, revealing that AI capabilities, not network latency, primarily drive user experience.

arXiv

Simultaneous Multi-objective Alignment Across Verifiable and Non-verifiable Rewards

June 2, 2026 · Yiran Shen, Yu Xia, Jonathan Chang, Prithviraj Ammanabrolu

MAHALO aligns LLMs across verifiable and subjective rewards using process reward model (PRM)-guided decoding and multi-action heads, enabling concurrent optimization with minimal interference and flexible user control.

arXiv

Verifying Meta-Awareness via Predictive Rewards in Reasoning Models

June 2, 2026 · Yoonjeon Kim, Doohyuk Jang, Eunho Yang

MAPR enhances reasoning models by predicting rollout statistics to optimize processing, boosting accuracy by 83.18% on AIME25 and accelerating training by 1.28x.

arXiv

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

June 2, 2026 · Hyung Gyu Rho

MADPO uses a reward model to adaptively weight DPO loss per sample, improving granular control over heterogeneous preference data. It outperforms existing methods by stabilizing training and preserving valuable signals.

arXiv

Domain-Shift-Aware Conformal Prediction for Large Language Models

June 2, 2026 · Zhexiao Lin, Yuanyuan Li, Neeraj Sarna, Yuanyuan Gao, Michael von Gablenz

DS-CP adapts conformal prediction for LLMs under domain shifts by weighting calibration samples based on test prompt proximity. It ensures reliable coverage and computational efficiency, enhancing trustworthy uncertainty quantification.
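
The weighting idea can be sketched with generic weighted split conformal prediction, in which each calibration score receives a weight and the threshold is a weighted quantile. This is a sketch under stated assumptions, not the DS-CP procedure: the function name is hypothetical, and the weights here are arbitrary numbers standing in for whatever proximity measure between calibration and test prompts is used.

```python
import numpy as np


def weighted_conformal_threshold(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration nonconformity scores.

    scores:      calibration nonconformity scores, shape (n,)
    weights:     nonnegative weight per calibration sample, shape (n,)
                 (assumed to reflect proximity to the test prompt)
    test_weight: positive weight of the test point, treated as a point mass at +inf
    alpha:       target miscoverage level in (0, 1)

    Hypothetical helper for illustration; not the paper's implementation.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if scores.shape != weights.shape or scores.ndim != 1:
        raise ValueError("scores and weights must be 1-D arrays of equal length")
    if np.any(weights < 0) or test_weight <= 0:
        raise ValueError("weights must be nonnegative and test_weight positive")

    total = weights.sum() + test_weight
    order = np.argsort(scores)
    # Cumulative normalized weight of the sorted calibration scores.
    cumulative = np.cumsum(weights[order]) / total
    # First sorted score whose cumulative weight reaches 1 - alpha.
    idx = np.searchsorted(cumulative, 1.0 - alpha, side="left")
    if idx >= len(scores):
        # Calibration mass is insufficient; the prediction set must include everything.
        return np.inf
    return scores[order][idx]


threshold = weighted_conformal_threshold(
    scores=[0.1, 0.2, 0.3, 0.4], weights=[1.0, 1.0, 1.0, 1.0], test_weight=1.0, alpha=0.5
)
```

A candidate answer would then be kept in the prediction set when its nonconformity score does not exceed the threshold. Returning infinity when the weighted calibration mass falls short of 1 - alpha is the conservative choice, since it keeps the coverage guarantee at the cost of an uninformative set.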

arXiv

HRTFformer: A Spatially-Aware Transformer for Individual HRTF Upsampling in Immersive Audio Rendering

June 2, 2026 · Xuyi Hu, Jian Li, Shaojie Zhang, Stefan Goetz, Lorenzo Picinali, Ozgur B. Akan, Aidan O. T. Hogg

HRTFformer is a transformer-based model that upsamples sparse HRTF data using spherical harmonics and attention mechanisms. It outperforms existing methods in accuracy and perceptual realism for immersive audio rendering.

arXiv

Value Flows

June 2, 2026 · Perry Dong, Chongyi Zheng, Chelsea Finn, Dorsa Sadigh, Benjamin Eysenbach

Value Flows uses flow-based models to estimate full return distributions, improving decision-making by quantifying state uncertainty. It achieves a 1.3x success rate boost across 62 benchmark tasks.

arXiv

StreamingVLM: Real-Time Understanding for Infinite Video Streams

June 2, 2026 · Ruyi Xu, Guangxuan Xiao, Yukang Chen, Liuning He, Yao Lu, Song Han

StreamingVLM enables real-time comprehension of infinite video streams via efficient KV caching and supervised fine-tuning (SFT). It achieves 8 FPS on an H100 GPU, outperforming GPT-4o mini and boosting general VQA capabilities.

arXiv

SHERLOCK: Towards Dynamic Knowledge Adaptation in LLM-enhanced E-commerce Risk Management

June 2, 2026 · Nan Lu, Yurong Hu, Jiaquan Fang, Yan Liu, Rui Dong, Yiming Wang, Rui Lin, Shaoyi Xu

SHERLOCK integrates domain knowledge with LLMs to automate e-commerce fraud detection. It boosts investigation throughput by 386.7% and maintains accuracy via a self-evolving data flywheel.

arXiv

Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods?

June 2, 2026 · Zihan Chen, Yiming Zhang, Hengguang Zhou, Zenghui Ding, Yining Sun, Cho-Jui Hsieh

This study reveals that current RL benchmarks fail to distinguish genuine progress due to data leakage, hiding poor generalization. It proposes new principles for robust evaluation to accurately assess RL methods.

arXiv

Catch-Only-One: Non-Transferable Examples for Model-Specific Authorization

June 2, 2026 · Zihan Wang, Zhiyong Ma, Zhongkui Ma, Shuofeng Liu, Akide Liu, Derui Wang, Minhui Xue, Guangdong Bai

Non-Transferable Examples (NTEs) recode data into model-specific subspaces, enabling authorized models to access information while blocking unauthorized ones. This training-free method ensures purpose limitation without relying on data perturbation or controlled training processes.