Technology
ResNet-34 with Lightweight Decoder for Accurate and Efficient Segmentation of Fetal Brain MRI
This study proposes a ResNet-34 model with a lightweight decoder for efficient fetal brain MRI segmentation. It achieves a 90.33% Dice score at high speed, outperforming baselines on the FeTA 2021 dataset.
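For readers unfamiliar with the metric, the Dice score quoted above measures overlap between a predicted mask and a reference mask. The following is a minimal sketch for flat binary masks, not the paper's evaluation code; the function name and the empty-mask convention are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flat binary masks (sequences of 0/1).

    Dice = 2 * |A intersect B| / (|A| + |B|)
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # One shared foreground voxel, mask sizes 2 and 1: Dice = 2*1 / (2+1).
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

A multi-class benchmark would typically compute this per tissue label and average the results; how the paper aggregates is not stated in the summary.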
Quantum Algorithm for Distributed Reduction of Entanglements (QADR): A Trainable and Simulation-Efficient QML Framework
QADR reduces VQC simulation memory from $\mathcal{O}(2^n)$ to $\mathcal{O}(n \cdot 2^{2d+1})$ via distributed entanglement reduction. It outperforms classical models on MNIST and NASA data, scaling to 2000 features where global VQCs fail.
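To give a feel for the two growth terms, the sketch below evaluates $2^n$ and $n \cdot 2^{2d+1}$ for sample values. The summary does not define $d$, so it is treated here as a small integer parameter that stays fixed as $n$ grows (an assumption), and big-O constants are ignored, so the numbers indicate scaling only, not real memory use.

```python
def global_vqc_term(n):
    """Growth term for a global VQC simulation: 2^n."""
    return 2 ** n


def qadr_term(n, d):
    """Growth term quoted for QADR: n * 2^(2d+1).

    `d` is not defined in the summary; it is assumed to be a small
    integer that does not grow with n.
    """
    return n * 2 ** (2 * d + 1)


if __name__ == "__main__":
    d = 2  # hypothetical value chosen only for illustration
    for n in (10, 20, 40):
        print(n, global_vqc_term(n), qadr_term(n, d))
```

With $d$ held fixed, the second term grows linearly in $n$ while the first doubles with every added qubit.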
ChronosAD: Leveraging Time Series Foundation Models for Accurate Anomaly Detection
ChronosAD uses time series foundation models and a Temporal Block to enhance anomaly detection. It outperforms state-of-the-art methods by 4.72% AUC across 11 datasets.
PSG-Nav: Probabilistic Scene Graph Navigation via Multiverse Decision Making
PSG-Nav uses probabilistic scene graphs and multiverse decision-making to handle perception uncertainty in open-vocabulary navigation. It achieves state-of-the-art success rates on MP3D, HM3D, and HSSD benchmarks.
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories
SkillAdaptor enables training-free, step-level adaptation for LLM agents by pinpointing failures in trajectories. It outperforms baselines on WebShop, PinchBench, and Claw-Eval with stable, auditable updates.
A Communication-Centric 6G-LLM Architecture for Scalable Tactical Autonomous Defense Vehicle Networks
This study proposes a 6G-LLM architecture for tactical vehicle networks, achieving 75% lower latency and 89% less overhead. Simulations show significant gains in coordination efficiency and mission success rates at scale.
TukaBench: A Culturally Grounded Jailbreak Benchmark for African Languages
TukaBench reveals that African languages and cultural contexts lower LLM refusal rates, exposing safety gaps. It introduces "Deflection" metrics and highlights reliability issues in low-resource language evaluations.
DiffuSent: Towards a Unified Diffusion Framework for Aspect-Based Sentiment Analysis
DiffuSent unifies ABSA subtasks via non-autoregressive diffusion, outperforming baselines with a 2.48-point F1 gain and 181x faster inference.
Digital Twin-Assisted Adaptive Multi-Agent DRL for Intelligent Spectrum and Resource Management in Open-RAN UAV-Enabled 6G Networks
This study proposes a digital twin-enhanced adaptive multi-agent DRL framework for intelligent spectrum and resource management in Open-RAN UAV-enabled 6G networks. Simulations demonstrate significant improvements in spectral efficiency, throughput, and energy consumption.
BRo-JEPA: Learning Modular Arithmetic in Latent Space
BRo-JEPA introduces a block-rotation predictor to encode modular arithmetic in latent space, achieving 99.46% zero-shot accuracy. This demonstrates that aligning architecture with task structure enables robust generalization of abstract algebraic rules.
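The summary does not describe the predictor itself. As general background on why rotations suit this task, the sketch below shows one well-known correspondence: a residue $a \bmod p$ can be stored as the angle $2\pi a / p$, so that composing rotations adds residues. This is an illustration of the underlying algebra only, not BRo-JEPA's architecture; all function names are hypothetical.

```python
import cmath
import math


def encode(a, p):
    """Map residue a (mod p) to a point on the unit circle."""
    return cmath.exp(2j * math.pi * a / p)


def decode(z, p):
    """Recover the residue from a unit-circle point by rounding its angle."""
    angle = cmath.phase(z) % (2 * math.pi)  # normalise to [0, 2*pi)
    return round(angle * p / (2 * math.pi)) % p


def mod_add(a, b, p):
    """Add residues by multiplying their rotations (angles add)."""
    return decode(encode(a, p) * encode(b, p), p)


if __name__ == "__main__":
    print(mod_add(5, 9, 11))  # same residue as (5 + 9) % 11
```

Because multiplying unit complex numbers adds their angles, wrap-around at $p$ comes for free, which is the sense in which a rotation-structured model matches the task.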
Needles at Scale: LLM-Assisted Target Selection for Windows Vulnerability Research
This paper introduces Symbolicate-Enrich-Sample, an LLM-assisted pipeline that prioritizes Windows functions for vulnerability research. It reduces 7.2 million functions to a manageable 22,000 candidates for efficient analysis.
Beyond Access: Guided LLM Scaffolding for Independent Learning in Undergraduate Statistics
Guided LLM scaffolding in statistics improved independent quiz performance and reasoning, unlike unrestricted access. This suggests structured AI interaction fosters genuine learning better than mere tool availability.
FreqLite: A Lightweight Frequency-Decomposed Linear Model with Adaptive Reversible Normalization for Robust Long-Term Time-Series Forecasting
FreqLite is a lightweight linear model that uses frequency decomposition and adaptive normalization for robust long-term forecasting. It outperforms PatchTST while using 4x fewer parameters and running 2.2x faster on commodity hardware.
Efficient Exploration for Iterative Nash Preference Optimization
This paper introduces an explicitly exploratory iterative Nash Preference Optimization algorithm that achieves an $O(\sqrt{T})$ regret bound, eliminating exponential KL-sensitivity. It further proves that using a minimax oracle refines this to $O(\log T)$.
Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing
Dr. DocBench is a rigorous benchmark for expert-level document parsing, featuring 4,514 challenging pages across 52 domains. It exposes current VLM limitations, driving advancements in complex document intelligence.
Consistent and Distinctive: LLM Benchmark Efficiency via Maximum Independent Set Prompt Selection on Similarity Graphs
This study proposes graph-based prompt selection using Maximum Independent Sets on similarity graphs to reduce LLM benchmark costs. Results show highly consistent rankings with significantly fewer prompts.
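The selection idea can be sketched as follows: connect two prompts when they are too similar, then keep a set of prompts with no edges between them. Finding a true maximum independent set is NP-hard, so this sketch substitutes a greedy lowest-degree heuristic that returns a maximal (not necessarily maximum) set. It is not the paper's algorithm, and the similarity matrix, threshold, and function name are assumptions.

```python
def greedy_independent_prompts(similarity, threshold):
    """Pick prompt indices such that no two selected prompts are similar.

    similarity: square matrix (list of lists) of pairwise scores.
    threshold:  pairs scoring >= threshold are joined by an edge.
    Returns a sorted list of indices forming a maximal independent set.
    """
    n = len(similarity)
    # Build adjacency sets of the similarity graph.
    neighbors = {
        i: {j for j in range(n) if j != i and similarity[i][j] >= threshold}
        for i in range(n)
    }
    remaining = set(range(n))
    selected = []
    while remaining:
        # Greedy rule: take the node with the fewest remaining neighbours,
        # breaking ties by index so the result is deterministic.
        node = min(remaining, key=lambda i: (len(neighbors[i] & remaining), i))
        selected.append(node)
        # Drop the chosen node and everything too similar to it.
        remaining -= neighbors[node] | {node}
    return sorted(selected)


if __name__ == "__main__":
    sim = [
        [1.0, 0.9, 0.1, 0.1],
        [0.9, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.95],
        [0.1, 0.1, 0.95, 1.0],
    ]
    print(greedy_independent_prompts(sim, threshold=0.8))
```

In this toy input, prompts 0 and 1 are near-duplicates, as are 2 and 3, so one prompt from each pair survives.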
Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical Memory
MAAD is a multi-agent framework using RAG and hierarchical memory to autonomously design software architectures. It outperforms MetaGPT in modularity, completeness, and traceability across case studies.
Neural Network Compression by Approximate Differential Equivalence
This paper compresses neural networks by grouping neurons with similar dynamics via approximate differential equivalence. It outperforms standard pruning methods while maintaining accuracy.
CEAR: Certified Ensemble Adversarial Robustness in DNNs
CEAR combines empirical and certified defenses via ensemble training with Gaussian noise and temperature variations. It achieves higher certified accuracy and robustness radii across standard datasets compared to baselines.
On the Evaluation of Spiking Neural Network Configurations for Network Intrusion Detection
This study evaluates 27 SNN variants for intrusion detection, finding that latency encoding and LeakyParallel neurons yield the best accuracy (92.11%) and speed.