Global News Digest

Technology

arXiv

MARFT: Multi-Agent Reinforcement Fine-Tuning

MARFT applies reinforcement fine-tuning to LLM-based multi-agent systems. It pairs a novel Markov Game formulation with a scalable framework aimed at challenges that traditional multi-agent reinforcement learning (MARL) struggles with.

arXiv

A Lightweight Context-Driven Training-Free Network for Scene Text Segmentation and Recognition

This training-free, lightweight framework uses context-driven segmentation to accelerate scene text recognition. It achieves state-of-the-art performance with significantly reduced computational resources.

arXiv

Erased but Not Forgotten: How Backdoors Compromise Concept Erasure

Backdoors can bypass concept erasure in diffusion models, persisting even after removal attempts. This study reveals that such attacks compromise safety protocols, exposing harmful content despite robust mitigation strategies.

arXiv

A Survey of 3D Reconstruction with Event Cameras

This survey reviews event-based 3D reconstruction methods, classifying them by input modality and technique. It also covers datasets and identifies key challenges for future research.

arXiv

DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?

DetailMaster is a new benchmark evaluating T2I models on long, complex prompts. It reveals significant performance bottlenecks and highlights the need for specialized training to handle detailed inputs effectively.

arXiv

Simulating Macroeconomic Expectations in Survey Experiments with LLM-based Economic Agents

This study uses LLM-based agents to simulate macroeconomic expectations in surveys, closely mirroring human data. It highlights that prior expectations and diverse information sources are crucial for replicating human-like reasoning and distributions.

arXiv

Can LLMs Reason Structurally? Benchmarking via the Lens of Data Structures

DSR-Bench evaluates LLMs’ structural reasoning via data structures, revealing significant limitations. Even top models scored only 0.46 on hard tasks, struggling with spatial, contextual, and self-referential reasoning.

arXiv

Value-Free Policy Optimization via Reward Partitioning

Reward Partition Optimization (RPO) eliminates value function estimation by normalizing rewards via prompt-level distributions. It outperforms baselines like DRO and KTO, offering stable, aligned, and diverse outputs without auxiliary models.

arXiv

Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

The Cooperation of Experts (CoE) framework integrates heterogeneous data via domain-specific encoders collaborating through large margin optimization. It demonstrates superior performance and robustness across diverse benchmarks.

arXiv

GFlowGR: Fine-tuning Generative Recommendation Frameworks with Generative Flow Networks

GFlowGR fine-tunes generative recommendation models using GFlowNets to mitigate exposure bias. It leverages collaborative knowledge and diverse sampling to enhance alignment with recommendation data.

arXiv

Hyperspherical Variational Autoencoders Using Efficient Spherical Cauchy Distribution

This paper introduces spherical Cauchy VAEs, offering a stable, fast alternative to von Mises-Fisher models by avoiding expensive Bessel functions. The method enables efficient, exact reparameterization and robust KL divergence computation for hyperspherical latent spaces.

arXiv

Truth, Trust, and Trouble: Medical AI on the Edge

This study benchmarks medical LLMs, finding AlpaCare-13B leads in accuracy and safety. While few-shot prompting boosts performance, models struggle with complex queries, highlighting trade-offs between truth, trust, and helpfulness.

arXiv

Model Parallelism With Subnetwork Data Parallelism

Subnetwork Data Parallelism (SDP) reduces per-device memory by 28–60% by training structured subnetworks without activation exchange. It maintains or improves performance while eliminating expensive communication overheads.

arXiv

Beyond Model Base Retrieval: Weaving Knowledge to Master Fine-grained Neural Network Design

M-DESIGN uses retrieval-augmented refinement and evidence graphs to optimize neural network design efficiently. It outperforms baselines in 26 of 33 scenarios, achieving top performance under strict computational budgets.

arXiv

AblationBench: Evaluating Automated Planning of Ablations in Empirical AI Research

AblationBench evaluates language models (LMs) on planning ablation experiments. Even the top models reach only 45% accuracy, falling short of human performance.

arXiv

FedS2R: One-Shot Federated Domain Generalization for Synthetic-to-Real Semantic Segmentation in Autonomous Driving

FedS2R is a one-shot federated framework for synthetic-to-real semantic segmentation in autonomous driving. It outperforms individual client models and trails centralized training by only 2 mIoU points.
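
For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class overlap between predicted and ground-truth labels, and a gap of "2 points" usually means 2 percentage points on a 0 to 100 scale. The following is a minimal sketch of the standard definition, not code from the paper; the function name `mean_iou` and the flat label lists are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes, as a fraction in [0, 1].

    `pred` and `target` are flat sequences of integer class labels
    (hypothetical input format chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both inputs
            ious.append(intersection / union)

    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {100 * score:.1f} points")
```

Multiplying the result by 100 gives the "points" scale used in the summary above.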

arXiv

Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning

RGVQ addresses codebook collapse in graph vector quantization via topology-aware regularization and Gumbel-Softmax. It significantly boosts codebook utilization and downstream performance.
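
The Gumbel-Softmax trick named above draws a differentiable, approximately one-hot sample over discrete choices such as codebook entries. The sketch below shows only the generic relaxation in plain Python; it is not RGVQ itself, and the function name, the `temperature` default, and the example logits are assumptions made for illustration.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample from unnormalized log-probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    perturbed = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform; clamp to avoid log(0).
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))
        perturbed.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed logits.
    peak = max(perturbed)
    exps = [math.exp(v - peak) for v in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
weights = gumbel_softmax_sample([1.0, 2.0, 3.0], temperature=0.5)
print(weights)
```

Lower temperatures push the output toward a hard one-hot choice, while higher temperatures spread the weight more evenly across entries.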

arXiv

From Graph Retrieval to Schema Realization: Counterfactual Validation for Text-to-SPARQL over Heterogeneous Knowledge Graphs

SchemaForge is an agentic framework for Text-to-SPARQL over heterogeneous knowledge graphs, using counterfactual validation to align schemas. It outperforms baselines by 11.5% in execution accuracy across four benchmarks.

arXiv

Toward accurate RUL and SoH estimation using reinforced graph-based physics-informed neural networks enhanced with dynamic weights

RGPD uses reinforced graph-based physics-informed neural networks with dynamic weighting to improve estimation of remaining useful life (RUL) and state of health (SoH). It reduces mean absolute percentage error (MAPE) by up to 20% across diverse degradation datasets.
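
MAPE is the mean absolute percentage error between predicted and true values. The sketch below gives the standard definition plus a relative-reduction helper; the summary does not say whether the 20% figure is relative or absolute, so treating it as relative is an assumption, and the function names and sample numbers are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Made-up numbers, purely to show how the two quantities relate.
error = mape([100.0, 200.0], [90.0, 220.0])
print(f"MAPE = {error:.2f}%")
print(f"reduction = {relative_reduction(10.0, 8.0):.1f}%")
```

Under the relative reading, a model that lowers MAPE from 10% to 8% has achieved a 20% reduction.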

arXiv

Between a Rock and a Hard Place: The Tension Between Ethical Reasoning and Safety Alignment in LLMs

The study reveals how ethical reasoning in LLMs creates vulnerabilities exploited by the TRIAL red-teaming protocol. It proposes ERR, a defensive framework using Layer-Stratified Harm-Gated LoRA to mitigate these reasoning-driven attacks.