Technology
CoMIC: Collaborative Memory and Insights Circulation for Long-Horizon LLM Agents in Cloud-Edge Systems
CoMIC is a cloud-edge framework enabling long-horizon LLM agents to share insights via centralized reflection and decentralized execution. It boosts performance without model updates, addressing edge resource limits and fragmented memory.
FALAT: Tracing Failures in LLM Agent Trajectories via Dependency-Guided Search
FALAT traces LLM agent failures via dependency-guided search, outperforming baselines on the Who&When benchmark. It achieves up to 46% step-level accuracy by identifying responsible agents and decisive error points in complex trajectories.
Interaction-Centered Intelligence: Toward Interaction as the Primary Unit of Analysis in Co-Creative AI and Human-AI Systems
This paper proposes "Interaction-Centered Intelligence," arguing that interaction, not isolated computation, is the primary unit of analysis for human-AI co-creation. It shifts focus from individual model outputs to dynamic, relational processes.
NBQ: Next-Best-Question for Dynamic Profiling
NBQ dynamically profiles users by selecting the question with maximum expected information gain, improving profiling quality by up to 14%. The paper also reports that QuickMatch accelerates reciprocal-matching retrieval by 22.9x with high recall.
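The idea of picking the question with maximum information gain can be sketched as follows. This is a minimal illustration under stated assumptions, not the NBQ authors' implementation: the discrete set of candidate profiles, the likelihood tables, and the names `expected_information_gain` and `next_best_question` are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_information_gain(prior, likelihoods):
    """Expected entropy reduction over profiles from asking one question.

    prior[h] is P(profile h); likelihoods[a][h] is P(answer a | profile h).
    Both structures are assumptions made for this sketch.
    """
    gain = entropy(prior)
    for answer_lik in likelihoods:
        # Marginal probability of observing this answer.
        p_answer = sum(l * p for l, p in zip(answer_lik, prior))
        if p_answer == 0:
            continue
        # Bayes update of the profile distribution given this answer.
        posterior = [l * p / p_answer for l, p in zip(answer_lik, prior)]
        gain -= p_answer * entropy(posterior)
    return gain

def next_best_question(prior, questions):
    """Return the key of the question with the largest expected gain."""
    return max(questions, key=lambda q: expected_information_gain(prior, questions[q]))

# Two equally likely profiles; one question separates them, one does not.
prior = [0.5, 0.5]
questions = {
    "discriminating": [[1.0, 0.0], [0.0, 1.0]],
    "uninformative": [[0.5, 0.5], [0.5, 0.5]],
}
best = next_best_question(prior, questions)
```

Here the discriminating question removes the full bit of uncertainty, while the uninformative one removes none, so the former is selected.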
Mitigating Hallucinations in Large Language Models Via Decoder Layer Skipping
DeLask mitigates LLM hallucinations by bypassing high-risk decoder layers identified via gradient drift. This approach blends hidden states to reduce errors while maintaining structural consistency.
Subliminal Learning is a LoRA Artifact
The paper argues that "subliminal learning" is a LoRA artifact rather than genuine behavioral transmission. The effect vanishes under full fine-tuning and depends on specific context, which makes it an unreliable mechanism.
Certificate-Guided Evaluation of Reinforcement Learning Generalization
This study introduces a logic-based framework using neural certificates to rigorously evaluate reinforcement learning generalization. It demonstrates that certificate violations inversely correlate with task success, offering a principled benchmark for RL algorithms.
Ryze: Evidence-Enriched Data Synthesis from Biomedical Papers
Ryze automates evidence-enriched data synthesis from biomedical papers and was used to build BioVLM-8B for under $200. This open-source model significantly outperforms GPT-5.2 on LAB-Bench, offering a cost-effective route to domain-specialized VLMs.
Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults
Adversarial feeds manipulate LLM agents’ uncertain decisions, shifting outcomes by up to 100%. However, they cannot override firm defaults, highlighting the need to audit feed layers in agent safety evaluations.
Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications
DIBS improves RL generalization by decoupling policy training from evolution function learning via behavioral cloning. This replaces noisy reward signals with robust supervision, enhancing stability and zero-shot performance.
Relational Intervention During Functional Collapse in Large Language Models: A Lexical-Statistical Ablation and a Structure x Register Factorial
This study finds that relational interventions during LLM functional collapse uniquely drive behavioral recovery through a significant structure-by-register interaction, distinct from lexical or attentional effects.
Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition
This study uses Partial Information Decomposition to analyze modality interactions in MLLMs, revealing distinct synergy and redundancy patterns. It introduces Sensory PID for tri-modal analysis and demonstrates that PID-guided reweighting improves multimodal reasoning performance.
Prospect-Theory Behavior from Bellman Optimality in MDPs with Catastrophic States
This study shows Bellman optimality in MDPs with catastrophic states generates prospect theory behaviors. Risk aversion near danger and risk-seeking in decline emerge endogenously, driven by boundary conditions rather than payoff asymmetry.
Subliminal Learning Is Steering Vector Distillation
Steering vector distillation explains subliminal learning: students inherit teacher traits via aligned activation vectors, not semantics. This mechanism requires adaptive optimizers to capture subtle gradient signals, enabling semantic transfer from non-semantic data.
Large Language Models in Transportation Systems Management and Operations: From Text Reasoning to Multi-modal Decision Support
This review examines LLMs and MM-LLMs in transportation management, categorizing applications and addressing challenges like data heterogeneity and real-time inference. It highlights their promise as multi-modal decision support tools while outlining future research directions.
Property Prediction of Stacked Bilayer Materials: A Multimodal Learning Approach
This study introduces a multimodal learning framework to predict properties of stacked bilayer materials. Experiments validate its superior effectiveness and efficiency over existing baselines.
Tackling the Root of Misinformation by Teaching Laypeople about Logical Fallacies via Socratic Questioning and Critical Argumentation
LFTutor uses LLMs with Socratic questioning to teach logical fallacies, significantly improving critical thinking. This approach helps combat misinformation by enhancing public argumentative literacy.
Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions
This study evaluates whether AI review can improve 20 computer architecture drafts, finding that it addresses many human-identified issues. The authors release an open-source tool to facilitate further research on AI-assisted manuscript refinement.
TravelEval: A Comprehensive Benchmarking Framework for Evaluating LLM-Powered Travel Planning Agents
TravelEval is a holistic benchmarking framework for LLM travel agents, featuring six-dimensional metrics and realistic simulation. It reveals LLMs struggle with spatio-temporal reasoning and global optimization.
DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts
DAG-MoE replaces weighted summation with structural aggregation to boost expert diversity and multi-step reasoning. This sparse architecture outperforms baselines in pretraining and fine-tuning while reducing routing overhead.
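For context, the weighted summation that DAG-MoE is said to replace can be sketched as below. This shows only the conventional top-k baseline, not DAG-MoE's structural aggregation, and the function name `moe_weighted_sum`, the toy experts, and the router logits are assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_weighted_sum(x, experts, router_logits, k=2):
    """Conventional sparse MoE output: gate-weighted sum of the top-k experts.

    experts is a list of callables mapping a vector to a vector of the same
    length; router_logits holds one score per expert (hypothetical inputs).
    """
    # Keep the k experts with the highest router scores.
    top = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)[:k]
    # Renormalize the gate weights over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out

# Three toy experts; the router favors the first two equally.
experts = [
    lambda v: [2 * a for a in v],
    lambda v: [a + 1 for a in v],
    lambda v: [0.0 for _ in v],
]
result = moe_weighted_sum([1.0, 2.0], experts, [1.0, 1.0, -5.0], k=2)
```

In this baseline each selected expert contributes independently and the outputs are simply averaged by gate weight, with no interaction between experts.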