Global News Digest

Technology

arXiv

UR-JEPA: Uniform Rectifiability as a Regularizer for Joint-Embedding Predictive Architectures

UR-JEPA uses uniform rectifiability to prevent representation collapse, outperforming LeJEPA with higher accuracy, lower variance, and a smaller model on remote-sensing datasets.

arXiv

Computation-Aware Kalman Filtering with Model Selection for Neural Dynamics

CASSM integrates model selection into Kalman filtering for neural dynamics, enabling efficient, scalable inference. It outperforms deep networks in uncertainty calibration while handling large state spaces with fewer trials.

arXiv

Emergent Transfer of a Physics Foundation Model from Simulation to Laboratory Turbulence

A physics foundation model fine-tuned on simulation data successfully predicts real-world turbulence, bridging the simulation-experiment gap. This demonstrates effective generalization to noisy laboratory environments without training on experimental data.

arXiv

MURMUR: An Efficient Inference System for Long-Form ASR

MURMUR optimizes long-form ASR via dual-level strategies, achieving single-pass accuracy with 4.2x lower latency. It balances chunk size and KV cache eviction for efficient, high-precision speech recognition.

arXiv

Hierarchical Online Prompt Mutation with Dual-Loop Feedback for Guardrailed Evidence Document Generation: A Production-Evaluation Case Study

HOPM, a hierarchical prompt mutation framework, significantly outperforms baselines in evidence document generation. It boosts win rates to 45.7% and quality scores to 4.40 via dual-loop feedback.

arXiv

Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX

Crazyflow is a GPU-accelerated JAX drone simulator enabling sub-centimeter accuracy and mid-flight RL training. It outperforms existing tools by orders of magnitude, supporting large-scale swarms and diverse robotic algorithms.

arXiv

ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

A study of ClawHub security signals finds low consensus among VirusTotal, static analysis, and SkillSpector, showing that each detects distinct threat types. This highlights the need for layered governance in securing AI agent skills.

arXiv

LLM Consortium for Software Design Refinement: A Controlled Experiment on Multi-Agent Collaboration Topologies

A study of 520 LLM experiments found adversarial topologies and cross-model reviews yield the best software designs, while parallel merges fail due to token starvation.

arXiv

On the Limits of Token Reduction for Efficient Unified Vision Language Training

Token reduction accelerates unified VLM training but harms synergy by forcing parameters into divergent pathways. Efficiency requires acceleration techniques that preserve shared structures across tasks.

arXiv

TimeSage-MT: A Multi-Turn Benchmark for Evaluating Agentic Time Series Reasoning

TimeSage-MT is a multi-turn benchmark evaluating agentic time series reasoning across 240 tasks. It reveals that current LLMs struggle with memory and decision-making in dynamic, cumulative workflows.

arXiv

Move the Query, Not the Cache: Characterizing Cross-Instance Latent Attention Redistribution Across GPU Fabrics

This study compares routing queries versus migrating the KV cache in cross-instance multi-head latent attention (MLA). Using H100 clusters, it develops cost models to determine the most efficient strategy based on network fabric and request characteristics.

arXiv

Agent Operating Systems (AOS): Integrating Agentic Control Planes into, and Beyond, Traditional Operating Systems

The paper proposes Agent Operating Systems (AOS) to integrate agentic control planes into traditional OSs, addressing limitations in scheduling, security, and state management for probabilistic AI agents.

arXiv

TN-SHAP-G: Graph-Structured Tensor Network Surrogates for Shapley Values and Interactions

TN-SHAP-G uses graph-structured tensor networks to efficiently compute Shapley values and interactions. This deterministic surrogate avoids Monte Carlo noise and scales effectively to large graphs.

arXiv

ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts

ProbMoE introduces differentiable probabilistic routing for Mixture-of-Experts, using exact marginal probabilities to approximate gradients. It enables robust Exact-k and adaptive Dynamic-k routing, improving expert utilization and efficiency.

arXiv

Compliance-Scored Best-of-N Guardrail Orchestration for Multimodal Document Generation in Payments Dispute Defense

This study introduces compliance-scored best-of-N guardrail orchestration for multimodal document generation, achieving 91% compliance and significantly improving dispute defense win rates.

arXiv

GJDNet: Robust Graph Neural Networks via Joint Disentangled Learning Against Adversarial Attacks

GJDNet enhances GNN robustness against adversarial attacks by jointly disentangling node representations and decision spaces. It mitigates structure-feature mismatches and stabilizes decision boundaries for superior node classification.

arXiv

Self-Conditioned Positional HNSW for Overlap-Aware Retrieval in Chunked-Document RAG Systems: Method and Industrial Evidence-Quality Audit

SCP-HNSW reduces redundant evidence in RAG by enforcing positional gaps during retrieval. Industrial audits confirm improved evidence quality and retrieval efficiency.

arXiv

Defenses & Enablers For Skill Injection Attacks on Terminal Based Agents

This study shows guardian-based defenses reduce skill injection attack success rates by over 50% in LLM agents. Dynamic guardians prove particularly robust against attack reframing techniques.

arXiv

Identifying High-Confidence Social Biases in LLMs for Trustworthy Conversational Tutoring Agents

This study identifies high-confidence social biases in LLMs used for tutoring, revealing that models are often overconfident in incorrect judgments. These findings highlight critical risks to feedback quality and student learning in educational AI systems.

arXiv

Estimating Mutual Information between Time Series and Temporal Event Sequences Across Diverse Analysis Tasks

This paper introduces a robust, nonparametric mutual information estimator for continuous-discrete temporal data, eliminating bias from quantization and redundancy. Validated across four tasks, it outperforms existing methods in accuracy and interpretability.