Global News Digest

Technology

arXiv

CoMIC: Collaborative Memory and Insights Circulation for Long-Horizon LLM Agents in Cloud-Edge Systems

CoMIC is a cloud-edge framework enabling long-horizon LLM agents to share insights via centralized reflection and decentralized execution. It boosts performance without model updates, addressing edge resource limits and fragmented memory.

arXiv

FALAT: Tracing Failures in LLM Agent Trajectories via Dependency-Guided Search

FALAT traces LLM agent failures via dependency-guided search, outperforming baselines on the Who&When benchmark. It achieves up to 46% step-level accuracy by identifying responsible agents and decisive error points in complex trajectories.

arXiv

Interaction-Centered Intelligence: Toward Interaction as the Primary Unit of Analysis in Co-Creative AI and Human-AI Systems

This paper proposes "Interaction-Centered Intelligence," arguing that interaction, not isolated computation, is the primary unit of analysis for human-AI co-creation. It shifts focus from individual model outputs to dynamic, relational processes.

arXiv

NBQ: Next-Best-Question for Dynamic Profiling

NBQ dynamically profiles users by selecting questions with maximum information gain, improving profiling quality by up to 14%. The accompanying QuickMatch method speeds up reciprocal-matching retrieval by 22.9x while keeping recall high.
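As a rough illustration of what "selecting questions with maximum information gain" can mean, the sketch below scores each candidate question by the expected reduction in entropy over a small set of hypothetical user profiles. This is the textbook expected-information-gain criterion, not the NBQ implementation; the function names, the two-profile prior, and the answer likelihoods are all assumptions made for the example.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction from asking one question.

    prior[h]         = P(profile h)                 (hypothetical profiles)
    likelihood[h][a] = P(answer a | profile h)      (assumed, not learned)
    """
    n_answers = len(likelihood[0])
    gain = entropy(prior)
    for a in range(n_answers):
        # Marginal probability of observing answer a.
        p_a = sum(prior[h] * likelihood[h][a] for h in range(len(prior)))
        if p_a == 0:
            continue
        # Bayesian posterior over profiles after observing answer a.
        posterior = [prior[h] * likelihood[h][a] / p_a for h in range(len(prior))]
        gain -= p_a * entropy(posterior)
    return gain


def next_best_question(prior, questions):
    """Return the name of the question with the highest expected gain."""
    return max(questions, key=lambda name: expected_information_gain(prior, questions[name]))


# Toy setup: two equally likely profiles, two yes/no questions.
prior = [0.5, 0.5]
questions = {
    "uninformative": [[0.5, 0.5], [0.5, 0.5]],  # same answers for both profiles
    "decisive": [[1.0, 0.0], [0.0, 1.0]],       # answer reveals the profile
}
best = next_best_question(prior, questions)
```

In this toy setup the "decisive" question is chosen because its answer fully separates the two profiles, while the "uninformative" question leaves the prior unchanged.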

arXiv

Mitigating Hallucinations in Large Language Models Via Decoder Layer Skipping

DeLask mitigates LLM hallucinations by bypassing high-risk decoder layers identified via gradient drift. This approach blends hidden states to reduce errors while maintaining structural consistency.

arXiv

Subliminal Learning is a LoRA Artifact

The paper argues that "subliminal learning" is a LoRA artifact rather than genuine behavioral transmission. The effect vanishes under full fine-tuning and depends on specific context, which makes it an unreliable mechanism.

arXiv

Certificate-Guided Evaluation of Reinforcement Learning Generalization

This study introduces a logic-based framework using neural certificates to rigorously evaluate reinforcement learning generalization. It demonstrates that certificate violations inversely correlate with task success, offering a principled benchmark for RL algorithms.

arXiv

Ryze: Evidence-Enriched Data Synthesis from Biomedical Papers

Ryze automates evidence-enriched data synthesis from biomedical papers, creating BioVLM-8B for under $200. This open-source model significantly outperforms GPT-5.2 on LAB-Bench, offering a cost-effective solution for domain-specialized VLMs.

arXiv

Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults

Adversarial feeds manipulate LLM agents’ uncertain decisions, shifting outcomes by up to 100%. However, they cannot override firm defaults, highlighting the need to audit feed layers in agent safety evaluations.

arXiv

Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications

DIBS improves RL generalization by decoupling policy training from evolution function learning via behavioral cloning. This replaces noisy reward signals with robust supervision, enhancing stability and zero-shot performance.

arXiv

Relational Intervention During Functional Collapse in Large Language Models: A Lexical-Statistical Ablation and a Structure x Register Factorial

This study finds that relational interventions during LLM functional collapse uniquely drive behavioral recovery through a significant structure-by-register interaction, distinct from lexical or attentional effects.

arXiv

Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition

This study uses Partial Information Decomposition to analyze modality interactions in MLLMs, revealing distinct synergy and redundancy patterns. It introduces Sensory PID for tri-modal analysis and demonstrates that PID-guided reweighting improves multimodal reasoning performance.

arXiv

Prospect-Theory Behavior from Bellman Optimality in MDPs with Catastrophic States

This study shows Bellman optimality in MDPs with catastrophic states generates prospect theory behaviors. Risk aversion near danger and risk-seeking in decline emerge endogenously, driven by boundary conditions rather than payoff asymmetry.

arXiv

Subliminal Learning Is Steering Vector Distillation

Steering vector distillation explains subliminal learning: students inherit teacher traits via aligned activation vectors, not semantics. This mechanism requires adaptive optimizers to capture subtle gradient signals, enabling semantic transfer from non-semantic data.

arXiv

Large Language Models in Transportation Systems Management and Operations: From Text Reasoning to Multi-modal Decision Support

This review examines LLMs and MM-LLMs in transportation management, categorizing applications and addressing challenges like data heterogeneity and real-time inference. It highlights their promise as multi-modal decision support tools while outlining future research directions.

arXiv

Property Prediction of Stacked Bilayer Materials: A Multimodal Learning Approach

This study introduces a multimodal learning framework to predict properties of stacked bilayer materials. In the reported experiments, it is both more effective and more efficient than existing baselines.

arXiv

Tackling the Root of Misinformation by Teaching Laypeople about Logical Fallacies via Socratic Questioning and Critical Argumentation

LFTutor uses LLMs with Socratic questioning to teach logical fallacies, significantly improving critical thinking. This approach helps combat misinformation by enhancing public argumentative literacy.

arXiv

Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions

This study evaluates whether AI review can improve paper drafts, using 20 computer architecture submissions, and finds that it addresses many of the issues human reviewers identified. The authors release an open-source tool to facilitate further research on AI-assisted manuscript refinement.

arXiv

TravelEval: A Comprehensive Benchmarking Framework for Evaluating LLM-Powered Travel Planning Agents

TravelEval is a holistic benchmarking framework for LLM travel agents, featuring six-dimensional metrics and realistic simulation. It reveals LLMs struggle with spatio-temporal reasoning and global optimization.

arXiv

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

DAG-MoE replaces weighted summation with structural aggregation to boost expert diversity and multi-step reasoning. This sparse architecture outperforms baselines in pretraining and fine-tuning while reducing routing overhead.
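For context, the "weighted summation" that DAG-MoE is said to replace is the conventional way a sparse mixture-of-experts layer combines its experts: route the input to the top-k experts and sum their outputs with softmax weights. The sketch below shows only that conventional baseline, in plain Python with toy experts; it does not attempt to reproduce DAG-MoE's structural aggregation, and the function names and example values are assumptions for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum_moe(x, experts, router_logits, k=2):
    """Conventional top-k MoE: softmax-weighted sum of the selected experts' outputs.

    x             : input vector (list of floats)
    experts       : list of callables mapping a vector to a same-length vector
    router_logits : one routing score per expert (here given, normally computed from x)
    """
    # Pick the k experts with the highest routing scores.
    top = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)[:k]
    # Renormalize the scores of the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    outputs = [experts[i](x) for i in top]
    # Combine expert outputs coordinate by coordinate.
    return [sum(w * out[d] for w, out in zip(weights, outputs)) for d in range(len(x))]


# Toy experts: doubling, shifting by one, and zeroing the input.
experts = [
    lambda v: [2 * a for a in v],
    lambda v: [a + 1 for a in v],
    lambda v: [0.0 for _ in v],
]
result = weighted_sum_moe([1.0, 3.0], experts, router_logits=[1.0, 1.0, -5.0], k=2)
```

With equal routing scores for the first two experts, each receives weight 0.5 and the third expert is never evaluated, which is where the sparsity of the layer comes from.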