Technology
Brain-Atlas-Guided Generative Counterfactual Attention for Explainable Cognitive Decline Diagnosis Using Multimodal Connectomes
GCAN uses brain-atlas-guided generative counterfactual attention on multimodal connectomes to diagnose cognitive decline. It ensures explainability by analyzing functional and structural connectivity disparities.
Recognize Your Orchestrator: An Entropy Dynamics Perspective for LLM Multi-Agent Systems
This study uses entropy dynamics to identify LLM orchestrators, revealing a "Reasoning Trap" where context limits hinder performance. It offers insights for designing stable multi-agent systems.
FlowTime: Towards Continuous Generative Watch Time Prediction via Flow-based Personalized Priors
FlowTime introduces continuous generative watch time prediction (WTP) using flow-based personalized priors, addressing multimodality and latency issues in existing methods. The authors also release TimeRec, an open-source WTP library.
Science Earth: Towards A Planet-Scale Operating System for AI-Native Scientific Discovery
Science Earth proposes a planet-scale AI operating system where capabilities autonomously collaborate via the EACN protocol. Validated by rapid theoretical corrections and large-scale single-cell analysis, it enables dynamic, open-ended scientific discovery.
SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems
SkillSmith co-evolves skills and tools via an ecological utility model, addressing static tool limitations. It outperforms baselines, especially in complex, multi-skill tasks.
Self-Healing Agentic Orchestrators for Reliable Tool-Augmented Large Language Model Systems
This study introduces a self-healing agentic orchestrator for tool-augmented LLM systems that achieves 98.8% task success by recovering from orchestration failures. It significantly outperforms static and retry-based baselines and eliminates semantic silent errors through verifier-guided recovery.
Early Diagnosis of Wasted Computation in Multi-Agent LLM Systems via Failure-Aware Observability
This study introduces a failure-aware observability framework to diagnose wasted computation in multi-agent LLM systems. It correlates failure patterns with trace indicators, revealing significant operational failures and rising token costs across GAIA validation levels.
GovAI-Pipe: A Layered AI Governance Pipeline for Citizen-Facing AI in Turkey's e-Government Gateway
GovAI-Pipe bridges Turkey’s e-Devlet AI deployment with global governance standards via a four-tier pipeline. It ensures compliant, auditable AI through pre-deployment validation, runtime monitoring, and post-incident accountability.
Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts
This paper introduces an A*-inspired multi-agent framework to generate obfuscated LLM prompts that induce commonsense hallucinations. It outperforms exhaustive search methods in attack success rate and efficiency.
GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning
GuidaPA uses Federated Learning to train a privacy-preserving chatbot for Italian Public Administration. It achieves high-quality responses comparable to centralized models while keeping data local.
Don't Ask the LLM to Track Freshness: A Deterministic Recipe for Memory Conflict Resolution
Replacing LLM-based conflict resolution with deterministic version tracking significantly boosts memory accuracy. This approach reaches up to 94.8% on single-hop tasks, outperforming existing systems.
Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence
This paper proposes a categorical framework for agentic AI in scientific discovery, distinguishing self-revising discovery from mere search. It demonstrates the framework's utility in materials science through protein mechanics and fiber network case studies.
Transferring Information Across Interventions in Causal Bayesian Optimization
This paper introduces graph-coupled causal Bayesian optimization, which leverages causal parameters shared across interventions to improve efficiency. Empirical results show that transferring knowledge between related interventions improves performance when optimizing costly systems.
A Minimalist Brain-Computer Musical Interface for Real-Time Emotion-Driven Sonification: System Design and Preliminary Evaluation
A study tested a brain-computer musical interface using frontal alpha asymmetry for emotion-driven sonification. Results showed the method failed to reliably detect intentional emotional states, highlighting significant individual variability.
An Enigma of Artificial Reason: Investigating the Production-Evaluation Gap in Large Reasoning Models
Large Reasoning Models exhibit a significant gap between producing and evaluating reasoning, driven by answer confirmation bias. Despite solving problems accurately, they fail to detect logical flaws, often inventing justifications for incorrect steps.
TERRA: Task-Embedded Reasoning and Representation Architecture for Cross-Domain Applications
TERRA formalizes cross-domain transfer via domain-invariant cores and adapters, quantifying fidelity with Gromov-Wasserstein distance. It establishes a transfer bound linking latent error to decision regret for action-conditioned models.
Joint Agent Memory and Exploration Learning via Novelty Signals
JAMEL jointly trains memory and exploration via novelty signals, offering annotation-free supervision. It outperforms baselines in unseen environments while reducing token usage.
RoleCDE: Benchmarking and Mitigating Role-Alignment Trade-offs in Role-Playing Agents
RoleCDE benchmarks role-playing agents and finds that they prioritize alignment over role values. Fine-tuning on the benchmark data mitigates this trade-off while maintaining general performance.
TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL
TRON is an online RL environment generating unlimited, verifiable visual reasoning data via a generator-verifier program. It boosts multimodal model performance across ten benchmarks without extra data collection.
Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization
This paper introduces Joint Neighborhood Optimization (JNO) to address ripple effects in knowledge editing. JNO simultaneously resolves coordination and leakage pressures, improving propagation and preservation metrics by at least 7.0%.