Technology News - Global News Digest

arXiv

Bridging the Last Mile of Time Series Forecasting with LLM Agents

June 2, 2026 · Yuhua Liao, Zetian Wang, Qiangqiang Nie, Zhenhua Zhang

This study introduces an LLM-agent framework to address the "last-mile" of time series forecasting by integrating business context into statistical predictions. The system enhances accuracy and auditability through reasoning, memory, and reflection mechanisms.

arXiv

Tracking the Behavioral Trajectories of Adapting Agents

June 2, 2026 · Jonah Leshin, Manish Shah, Ian Timmis

This paper introduces a framework to quantify agent traits by analyzing text embedding differences in skill files. Validated with high accuracy, it enables agents to monitor each other's behavioral evolution via a secure protocol.

arXiv

ClinEnv: An Interactive Multi-Stage Long Horizon EHR Environment for Agents

June 2, 2026 · Yuxing Lu, Yushuhong Lin, Wenqi Shi, J. Ben Tamo, Xukai Zhao, Jinzhuo Wang, May Dongmei Wang

ClinEnv evaluates LLMs as physicians via interactive, multi-stage inpatient simulations. Results show models struggle with management decisions and redundant querying, revealing gaps hidden by static benchmarks.

arXiv

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment

June 2, 2026 · Hao Li, Jingkun An, Zijun Song, Pengyu Zhu, Rui Li, Hao Wang, Wendi Feng, Yesheng Liu, Lijun Li, Jin-Ge Yao, Lei Sha

SafeSteer aligns LLMs via localized on-policy distillation on safety tokens, minimizing capability loss. It achieves robust safety with only 100 harmful samples, requiring less than 1% of the data used by prior methods.

arXiv

A Lightweight Deep Learning-based Model for Ranking Influential Nodes in Complex Networks

June 2, 2026 · Mohammed A. Ramadhan, Abdulhakeem O. Mohammed

The 1D-CGS model combines 1D-CNNs and GraphSAGE to rank influential nodes in complex networks. It outperforms existing methods in both ranking accuracy and speed at low computational cost.

arXiv

A Novel Data Augmentation Strategy for Robust Deep Learning Classification of Biomedical Time-Series Data: Application to ECG and EEG Analysis

June 2, 2026 · Mohammed Guhdar, Ramadhan J. Mstafa, Abdulhakeem O. Mohammed

This study proposes a unified CNN framework with novel time-domain data augmentation to robustly classify ECG and EEG signals. It addresses class imbalance and outperforms existing methods on benchmark datasets.

arXiv

BenHalluEval: A Multi-Task Hallucination Evaluation Framework for Large Language Models on Bengali

June 2, 2026 · Shefayat E Shams Adib, Ahmed Alfey Sani, Ekramul Alam Esham, Ajwad Abrar, Ishmam Tashdeed, Md Taukir Azam Chowdhury

BenHalluEval introduces the first Bengali hallucination benchmark, revealing significant LLM calibration gaps. Its dual-track metric exposes limitations of single-track evaluations and prompting in low-resource contexts.

arXiv

Empathic and agentic artificial intelligence in nursing: perspectives on a human-centered framework for cancer care navigation in the United States

June 2, 2026 · Tyra Girdwood, Saba Kheirinejad, Parnian Kheirkhah Rahimabad, Brianna M. White, Robert L Davis, David L Schwartz, Arash Shaban-Nejad

This article proposes a human-centered AI framework for US cancer care navigation, augmenting nurses’ empathy and agency to address resource gaps and improve care coordination.

arXiv

RuleEdit: Failure-Guided Human-AI Model Editing with Prospective Impact Preview

June 2, 2026 · Min Hun Lee, Justin Yu Feng Teo

RuleEdit is a human-AI framework using failure detection and impact previews to guide model editing. It significantly improved stroke rehabilitation assessment performance and feedback quality, revealing a local-global trade-off.

arXiv

A phenomenon of AI-conformity: how algorithms change human moral decision-making

June 2, 2026 · Yana Venerina, Dmitry Koch, Nare Meloyan, Gerda Prutko, Valeriia Lelik, Victoria Taova, Andrey Kurpatov

This study reveals that AI reasoning influences human moral judgments as strongly as social pressure. It challenges the idea that morality is immune to algorithmic conformity.

arXiv

DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset

June 2, 2026 · Shannan Liu, Peifeng Li, Yaxin Fan, Qiaoming Zhu

DraDDP is the first multimodal, multi-party dialogue discourse parsing dataset, featuring 6,374 utterances from TV dramas. It demonstrates that multimodal data significantly improves parsing accuracy.

arXiv

Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

June 2, 2026 · Hao Xu, Rite Bo, Fausto Giunchiglia, Yingji Li, Rui Song

DOPA enhances LLM robustness by using out-of-distribution proxies to retrieve diverse demonstrations when target domains are inaccessible. It leverages Mahalanobis distance to ensure variety, significantly improving performance in OOD scenarios.
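
For readers unfamiliar with the Mahalanobis distance mentioned above, the following is a minimal sketch of the distance itself, not of DOPA. The summary does not say how the paper applies it, so the two-dimensional setting, the sample points, and the function names here are all illustrative assumptions.

```python
import math


def mean_vector(points):
    """Component-wise mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def covariance_2d(points, mu):
    """Sample covariance entries (sxx, sxy, syy) of 2D points."""
    n = len(points)
    sxx = sum((p[0] - mu[0]) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mu[1]) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance of point x from a distribution (mu, cov)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Closed-form inverse of [[sxx, sxy], [sxy, syy]] applied to (dx, dy).
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)


# Hypothetical 2D embeddings standing in for a reference set.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mu = mean_vector(reference)
cov = covariance_2d(reference, mu)
print(mahalanobis_2d((3.0, 1.0), mu, cov))
```

Unlike Euclidean distance, this measure scales each direction by the spread of the reference data, so a point is "far" only relative to how much the data varies along that direction.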

arXiv

Examine Clinicians' Modification of Hedging Language in Ambient AI Documentation: A Comparative Study of AI Drafts and Final Notes

June 2, 2026 · Yiliang Zhou, Yawen Guo, Di Hu, Sairam Sutari, Emilie Chow, Steven Tam, Danielle Perret, Deepti Pandita, Kai Zheng

Clinicians increased hedging language when finalizing ambient AI drafts, adding hedges more often than removing them. These linguistic shifts varied significantly across vendors and specialties.

arXiv

SortingHat: Redefining Operating Systems Education with a Tailored Digital Teaching Assistant

June 2, 2026 · Yifan Zhang, Xinkui Zhao, Zuxin Wang, Zhengyi Zhou, Guanjie Chen, Shuiguang Deng, Jianwei Yin

SortingHat is an AI teaching assistant using RAG and MARL to personalize Operating Systems education. It provides adaptive 3D mentorship, customized exercises, and automated grading to improve learning outcomes.

arXiv

Understanding Stigmatizing Language in Clinical Documentation: A Paired Comparison of Ambient AI Drafts and Clinician Finalized Notes

June 2, 2026 · Yiliang Zhou, Yawen Guo, Sairam Sutari, Jasmine Dhillon, Alexandra L. Beck, Emilie Chow, Steven Tam, Danielle Perret, Deepti Pandita, Gelareh Sadigh, Archana J. McEligot, Kai Zheng

A study of 66,297 notes reveals clinician editing of AI drafts increases stigmatizing language from 21.4% to 24.0%. This suggests human review inadvertently adds bias to electronic health records.
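
To put the reported shift in perspective, the short sketch below separates the percentage-point change from the relative change. It assumes the two figures are rates measured over the same set of notes, which the summary does not state explicitly. The function names are illustrative.

```python
def percentage_point_change(before_pct, after_pct):
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change_pct(before_pct, after_pct):
    """Change expressed as a percentage of the starting value."""
    return (after_pct - before_pct) / before_pct * 100.0


before, after = 21.4, 24.0  # rates quoted in the summary above
print(round(percentage_point_change(before, after), 1))  # 2.6
print(round(relative_change_pct(before, after), 1))      # 12.1
```

Under that assumption, the rise is 2.6 percentage points, or roughly a 12% relative increase over the AI draft baseline.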

arXiv

AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection

June 2, 2026 · Aria Nourbakhsh, Adelaide Danilov, Christoph Schommer, Salima Lamsiyah

AEyeDE detects AI-generated text by analyzing attention attribution maps via a CNN. It outperforms baselines, offering robust, interpretable detection across diverse scenarios.

arXiv

SENSE: Semantic Embedding Navigation with Soft-gated Evaluation for Retrieval-based Speculative Decoding

June 2, 2026 · Shaowen Chen, Zhicheng Liao, Hongwei Wang

SENSE improves retrieval-based speculative decoding by using semantic embedding navigation and soft-gated evaluation to bypass strict lexical dependencies. It achieves up to 3.26x speedup while maintaining generation quality across LLaMA and Qwen models.

arXiv

CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards

June 2, 2026 · Wei Tian, Yuhao Zhou, Man Lan

CSRP improves Chinese text correction via chain-of-thought reasoning and efficiency-aware reinforcement learning, reducing over-correction. It achieves state-of-the-art results on NACGEC and CSCD benchmarks, outperforming GPT-4.

arXiv

lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation

June 2, 2026 · Alexey Tikhonov, Alexey Ivanov

The "lmfaoooo" team won SemEval-2026 Task 1 by using a preference model trained on pairwise comparisons to select the best humor from diverse candidates.

arXiv

TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models

June 2, 2026 · Yichuan Mo, Yukun Jiang, Yanbo Shi, Mingjie Li, Michael Backes, Yang Zhang, Yisen Wang

TrustLDM benchmarks trustworthiness in Language Diffusion Models, revealing alignment deterioration with malicious post-contexts. It introduces TrustLDM-Auto to systematically identify vulnerabilities across safety, privacy, and fairness.