Global News Digest

Technology

arXiv

Brain-Atlas-Guided Generative Counterfactual Attention for Explainable Cognitive Decline Diagnosis Using Multimodal Connectomes

GCAN uses brain-atlas-guided generative counterfactual attention on multimodal connectomes to diagnose cognitive decline. It ensures explainability by analyzing functional and structural connectivity disparities.

arXiv

Recognize Your Orchestrator: An Entropy Dynamics Perspective for LLM Multi-Agent Systems

This study uses entropy dynamics to identify LLM orchestrators, revealing a "Reasoning Trap" where context limits hinder performance. It offers insights for designing stable multi-agent systems.

arXiv

FlowTime: Towards Continuous Generative Watch Time Prediction via Flow-based Personalized Priors

FlowTime introduces continuous generative watch time prediction (WTP) using flow-based personalized priors, addressing multimodality and latency issues in existing methods. It also releases TimeRec, an open-source WTP library.

arXiv

Science Earth: Towards A Planet-Scale Operating System for AI-Native Scientific Discovery

Science Earth proposes a planet-scale AI operating system where capabilities autonomously collaborate via the EACN protocol. Validated by rapid theoretical corrections and large-scale single-cell analysis, it enables dynamic, open-ended scientific discovery.

arXiv

SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems

SkillSmith co-evolves skills and tools via an ecological utility model, addressing static tool limitations. It outperforms baselines, especially in complex, multi-skill tasks.

arXiv

Self-Healing Agentic Orchestrators for Reliable Tool-Augmented Large Language Model Systems

This study introduces a self-healing agentic orchestrator for tool-augmented LLM systems that reaches 98.8% task success by recovering from orchestration failures. It significantly outperforms static and retry-based baselines and eliminates silent semantic errors through verifier-guided recovery.

arXiv

Early Diagnosis of Wasted Computation in Multi-Agent LLM Systems via Failure-Aware Observability

This study introduces a failure-aware observability framework to diagnose wasted computation in multi-agent LLM systems. It correlates failure patterns with trace indicators, revealing significant operational failures and rising token costs across GAIA validation levels.

arXiv

GovAI-Pipe: A Layered AI Governance Pipeline for Citizen-Facing AI in Turkey's e-Government Gateway

GovAI-Pipe bridges Turkey’s e-Devlet AI deployment with global governance standards via a four-tier pipeline. It ensures compliant, auditable AI through pre-deployment validation, runtime monitoring, and post-incident accountability.

arXiv

Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts

This paper introduces an A*-inspired multi-agent framework to generate obfuscated LLM prompts that induce commonsense hallucinations. It outperforms exhaustive search methods in attack success rate and efficiency.

arXiv

GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning

GuidaPA uses Federated Learning to train a privacy-preserving chatbot for Italian Public Administration. It achieves high-quality responses comparable to centralized models while keeping data local.

arXiv

Don't Ask the LLM to Track Freshness: A Deterministic Recipe for Memory Conflict Resolution

Replacing LLM-based conflict resolution with deterministic version tracking significantly boosts memory accuracy. This approach achieves up to 94.8% accuracy on single-hop tasks, outperforming existing systems.
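
As a rough illustration of what deterministic version tracking can look like, the sketch below keeps only the highest-version record per key, so a stale write never overrides a newer one and no model call is needed to decide. This is not the paper's implementation: the names `MemoryRecord` and `VersionedMemory`, the integer `version` counter, and the example keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MemoryRecord:
    key: str      # hypothetical fact identifier, e.g. "user.city"
    value: str
    version: int  # assumed monotonically increasing write counter


class VersionedMemory:
    """Keeps the highest-version record per key; no LLM call is involved."""

    def __init__(self) -> None:
        self._store: Dict[str, MemoryRecord] = {}

    def write(self, record: MemoryRecord) -> bool:
        """Store the record if it is newer than the held one; return True if stored."""
        current = self._store.get(record.key)
        if current is not None and current.version >= record.version:
            return False  # stale or duplicate write is rejected deterministically
        self._store[record.key] = record
        return True

    def read(self, key: str) -> Optional[str]:
        """Return the freshest value for the key, or None if the key is unknown."""
        record = self._store.get(key)
        return record.value if record is not None else None


memory = VersionedMemory()
memory.write(MemoryRecord("user.city", "Rome", version=1))
memory.write(MemoryRecord("user.city", "Milan", version=2))
# A late-arriving copy of the old fact does not overwrite the newer one.
stale_accepted = memory.write(MemoryRecord("user.city", "Rome", version=1))
```

The conflict rule here is a plain integer comparison, so the same sequence of writes always yields the same memory state regardless of arrival order.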

arXiv

Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence

This paper proposes a categorical framework for agentic AI in scientific discovery, distinguishing it from mere search. It demonstrates the model's utility in materials science through protein mechanics and fiber network case studies.

arXiv

Transferring Information Across Interventions in Causal Bayesian Optimization

This paper introduces graph-coupled causal Bayesian optimization, which leverages causal parameters shared across interventions to improve efficiency. Empirical results confirm its superior performance in optimizing costly systems by transferring knowledge between related interventions.

arXiv

A Minimalist Brain-Computer Musical Interface for Real-Time Emotion-Driven Sonification: System Design and Preliminary Evaluation

A study tested a brain-computer musical interface using frontal alpha asymmetry for emotion-driven sonification. Results showed the method failed to reliably detect intentional emotional states, highlighting significant individual variability.
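
For readers unfamiliar with the index, frontal alpha asymmetry is conventionally computed as the natural log of alpha-band power at a right frontal electrode minus the log of alpha-band power at the matching left electrode. The sketch below shows that arithmetic on synthetic signals. The 8 to 13 Hz band, the sampling rate, and the simple periodogram estimate are assumptions chosen for illustration, not details taken from the study.

```python
import numpy as np


def band_power(signal: np.ndarray, fs: float, low: float = 8.0, high: float = 13.0) -> float:
    """Mean periodogram power of `signal` within [low, high] Hz."""
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2 / signal.size
    mask = (freqs >= low) & (freqs <= high)
    return float(power[mask].mean())


def frontal_alpha_asymmetry(left: np.ndarray, right: np.ndarray, fs: float) -> float:
    """ln(right alpha power) - ln(left alpha power)."""
    return float(np.log(band_power(right, fs)) - np.log(band_power(left, fs)))


# Synthetic 2-second signals (assumed 256 Hz sampling): a 10 Hz alpha rhythm
# with twice the amplitude on the right channel.
fs = 256.0
t = np.arange(0, 2.0, 1.0 / fs)
left = 1.0 * np.sin(2 * np.pi * 10 * t)
right = 2.0 * np.sin(2 * np.pi * 10 * t)
faa = frontal_alpha_asymmetry(left, right, fs)
```

Doubling the amplitude quadruples the power, so the index for these synthetic signals is ln 4, about 1.386. Real EEG is far noisier, which is consistent with the individual variability the study reports.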

arXiv

An Enigma of Artificial Reason: Investigating the Production-Evaluation Gap in Large Reasoning Models

Large Reasoning Models exhibit a significant gap between producing and evaluating reasoning, driven by answer confirmation bias. Despite solving problems accurately, they fail to detect logical flaws, often inventing justifications for incorrect steps.

arXiv

TERRA: Task-Embedded Reasoning and Representation Architecture for Cross-Domain Applications

TERRA formalizes cross-domain transfer via domain-invariant cores and adapters, quantifying fidelity with Gromov-Wasserstein distance. It establishes a transfer bound linking latent error to decision regret for action-conditioned models.

arXiv

Joint Agent Memory and Exploration Learning via Novelty Signals

JAMEL jointly trains memory and exploration via novelty signals, offering annotation-free supervision. It outperforms baselines in unseen environments while reducing token usage.

arXiv

RoleCDE: Benchmarking and Mitigating Role-Alignment Trade-offs in Role-Playing Agents

RoleCDE benchmarks role-playing agents, revealing that they prioritize alignment over role values. Fine-tuning on the benchmark's data mitigates this trade-off while maintaining general performance.

arXiv

TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

TRON is an online RL environment generating unlimited, verifiable visual reasoning data via a generator-verifier program. It boosts multimodal model performance across ten benchmarks without extra data collection.

arXiv

Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization

This paper introduces Joint Neighborhood Optimization (JNO) to address ripple effects in knowledge editing. JNO simultaneously resolves coordination and leakage pressures, improving propagation and preservation metrics by at least 7.0%.