Global News Digest

Technology

arXiv

Agentic Transformers Provably Learn to Search via Reinforcement Learning

This study proves agentic transformers learn randomized depth-first search via RL, using specialized heads for action tracking and backtracking. The mechanism emerges from sparse feedback, enabling depth generalization and optimized search under imbalanced goals.
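For readers unfamiliar with the search procedure named in this summary, here is a minimal sketch of a textbook randomized depth-first search with explicit backtracking. It is not the paper's transformer mechanism. The graph representation and the `randomized_dfs` name are assumptions made for illustration.

```python
import random


def randomized_dfs(graph, start, goal, rng=None):
    """Return a path from start to goal, or None if the goal is unreachable.

    graph is a dict mapping each node to a list of neighbours (hypothetical
    representation chosen for this sketch).
    """
    rng = rng or random.Random(0)
    path = [start]      # current path from the start node to the node being explored
    visited = {start}   # nodes already tried, so no node is explored twice
    while path:
        node = path[-1]
        if node == goal:
            return list(path)
        unvisited = [n for n in graph.get(node, []) if n not in visited]
        if unvisited:
            nxt = rng.choice(unvisited)  # randomized choice of the next action
            visited.add(nxt)
            path.append(nxt)
        else:
            path.pop()                   # dead end: backtrack one step
    return None
```

The `visited` set plays the role of action tracking and `path.pop()` is the backtracking step; how the paper's attention heads realize these roles is not described in the summary.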

arXiv

Learning to Construct Practical Agentic Systems

This paper introduces a modular framework for practical agentic systems that balances performance against simplicity and cost. It combines hand-engineered fixed workflows with novel learning techniques to optimize both accuracy and inference cost.

arXiv

BAGEN: Are LLM Agents Budget-Aware?

The BAGEN study finds that LLM agents lack inherent budget awareness and often waste resources through over-optimism. Training improves alerting and reduces costs, but precise calibration of budget intervals remains challenging.

arXiv

From Rashomon Theory to PRAXIS: Efficient Decision Tree Rashomon Sets

PRAXIS efficiently approximates decision tree Rashomon sets, drastically reducing runtime and memory usage. It enables robust, interpretable machine learning models to be built at scale on real-world data.

arXiv

The New Social Image: How AI Competency and AI Proactivity Influence Self- and Peer-Perceptions in the Workplace

Low AI competency and proactivity boost ownership and satisfaction, while high performance may undermine both. Workplace AI design must prioritize human perceptions over pure performance metrics to preserve job meaningfulness.

arXiv

Continuous Reasoning for Vision-Language-Action

This paper introduces Continuous Reasoning for VLA, using a shared Gaussian latent interface to replace text for fine-grained control. It employs self-verification to ensure robust, generalizable action prediction.

arXiv

Civilizational Metamaterials: Engineering Coordination Under Capability Gradients and Structural Turbulence

This paper proposes a metamaterials-based framework to quantify governance, addressing AGI-induced "Freezing Equilibrium" through a constitutive law for institutional coordination. It outlines a three-tier provenance taxonomy and a trial to test hypotheses on preventing structural turbulence.

arXiv

InfoAtlas: A Foundation Model for Zero-Shot Statistical Dependence Estimate

InfoAtlas is a foundation model enabling instant, zero-shot mutual information estimation via a single forward pass. It matches state-of-the-art precision while offering 100x speed improvements and robust generalization.

arXiv

SEMBridge: Tagless-Final Program Semantics with Weakest-Precondition and Bounded-Checking Interpretations

SEMBridge is a Python framework generating weakest-precondition and bounded-checking interpretations from unified tagless-final programs. It synchronizes executable semantics with verification artifacts for rigorous program validation.

arXiv

Effects of Varying LLM Access on Essay Writing Behavior

Unrestricted LLM access reduced student authorship and creativity, while restricted use fostered strategic revision and ownership without compromising essay quality.

arXiv

When Softmax Fails at the Top: Extreme Value Corrections for InfoNCE

WEINCE corrects InfoNCE’s softmax limitations using extreme value theory, blending logits with batch statistics. It boosts frozen-feature performance on vision benchmarks without extra parameters.
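As background for this item, the sketch below computes the standard InfoNCE loss, whose softmax over in-batch candidates is what the summary says WEINCE corrects. The extreme-value correction itself is not specified in the summary and is not implemented here. The `info_nce` name, the similarity-matrix layout (positives on the diagonal), and the default temperature are assumptions.

```python
import math


def info_nce(similarities, temperature=0.1):
    """Mean InfoNCE loss for a square similarity matrix.

    similarities[i][j] is the similarity between anchor i and candidate j;
    the positive for anchor i is assumed to sit on the diagonal (j == i).
    """
    losses = []
    for i, row in enumerate(similarities):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_norm - logits[i])  # -log softmax of the positive
    return sum(losses) / len(losses)
```

With uninformative similarities the loss equals the log of the batch size, and it falls as positives score higher than negatives.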

arXiv

StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement

StressDream steers video world models toward high-impact, plausible outcomes by optimizing initial noise. This enables robust policy evaluation and improvement by identifying actions leading to undesirable results.

arXiv

Hyperbolic and Evidence-Prioritized Experts for Large Vision-Language Models

AsyMoE uses hyperbolic geometry and evidence-prioritized experts to address modality asymmetry in LVLMs. It outperforms baselines by up to 3.8% and reduces parameter activation by 25.45%.

arXiv

Synthetic Data from Cross-Domain Events for Large-Scale Recommendation Systems

SCALR generates synthetic user-item interactions for recommendation systems by translating source domain events, addressing data sparsity. This model-agnostic approach significantly improves performance in industrial A/B testing.

arXiv

ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate

ARCA mitigates token signal degeneration in LoRA-based RL by measuring adapter residuals for credit assignment. It achieves competitive MATH performance without learned reward models or value heads.

arXiv

Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance

TOPD improves on-policy distillation by using near-future guidance to target true reasoning divergences, boosting accuracy to 52.2% and outperforming standard methods on AIME benchmarks.

arXiv

Real2SAM2Real: Generative 3D Caches as Complementary Context for Video Diffusion

Real2SAM2Real enhances video diffusion with generative 3D caches for precise camera and object control. This approach ensures robust spatiotemporal consistency during complex motions and occlusions.

arXiv

Rethinking the Role of Temperature in Large Language Model Distillation

This study reveals temperature’s asymmetric impact on KL divergences, showing FKL outperforms RKL at higher temperatures. This overturns standard distillation practices, enabling simple KL methods to compete with state-of-the-art approaches.
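To make the two objectives concrete, here is a small sketch of forward KL (teacher to student) and reverse KL (student to teacher) computed on temperature-scaled softmax distributions. It only illustrates the standard definitions and does not reproduce the study's experiments. The function names and the example logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature flattens the output."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def forward_and_reverse_kl(teacher_logits, student_logits, temperature):
    """Return (FKL, RKL) = (KL(teacher || student), KL(student || teacher))."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return kl(p, q), kl(q, p)


# Hypothetical logits, chosen only to show the two divergences differ.
fkl, rkl = forward_and_reverse_kl([2.0, 0.0, -1.0], [0.5, 0.2, 0.1], temperature=1.0)
```

The two values generally differ for the same pair of distributions, which is why the choice of direction interacts with temperature.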

arXiv

DRL-Based Pose Control for Double-Ackermann Robots Under Actuation Uncertainties

This study enhances double-Ackermann robot pose control using DRL and a sim-to-sim-to-real approach to address actuation uncertainties. The method achieves high transfer success to physical hardware without further tuning.

arXiv

LLMs Need Encoders for Semantic IDs Too

The paper introduces PrefixMem, a lightweight encoder for Semantic IDs in LLMs, demonstrating significant accuracy and recall improvements. This confirms that dedicated encoders, like those for vision, are essential for handling context-dependent non-textual modalities effectively.