Global News Digest

arXiv

Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) for Exponential Compression of Deep Neural Networks

Title: Achieving Exponential Compression in Deep Neural Networks via Automatically Differentiable Nonlinear Tensor Networks (ADNTNs)

Abstract

This study investigates Automatically Differentiable Nonlinear Tensor Networks (ADNTNs), a class of structured weight generators in which compact core tensors are optimized end-to-end with reverse-mode automatic differentiation (AD). Conceptually, ADNTNs extend the principles of tensor factorization and low-rank adaptation. Rather than relying on a single low-rank matrix update, an ADNTN builds large weight tensors from a hierarchy of small core tensors, nonlinear activation functions, and optional lateral mixing tensors.
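To make the idea of a structured weight generator concrete, here is a minimal sketch of a tree-style forward pass in Python with NumPy. It is not the paper's construction: the binary tree layout, the single shared core per level, the `tanh` activation, and the names `expand_level` and `generate_weights` are all assumptions chosen for illustration, and lateral mixing tensors are omitted. Only the forward expansion is shown; in an AD framework the same function would be differentiated with respect to `top` and `cores`.

```python
import numpy as np

def expand_level(x, core, act=np.tanh):
    """Split every open index of x into two child indices.

    x    : tensor whose axes all have size core.shape[0]
    core : array of shape (parent_dim, child_dim, child_dim),
           shared by all indices at this level (an assumption)
    """
    # Each pass contracts the current leading axis with the core's parent
    # index and appends the two child axes at the end.
    for _ in range(x.ndim):
        x = np.tensordot(x, core, axes=([0], [0]))
    # Elementwise nonlinearity between levels (hypothetical choice).
    return act(x)

def generate_weights(top, cores, out_shape, act=np.tanh):
    """Expand a small top tensor into a large weight tensor."""
    x = top
    for core in cores:
        x = expand_level(x, core, act)
    return x.reshape(out_shape)

rng = np.random.default_rng(0)
top = rng.standard_normal((2, 2))                 # 4 parameters
cores = [rng.standard_normal((2, 3, 3)),          # 18 parameters
         rng.standard_normal((3, 4, 4))]          # 48 parameters

W = generate_weights(top, cores, out_shape=(256, 256))
n_params = top.size + sum(c.size for c in cores)
print(W.shape, n_params)  # (256, 256) 70
```

In this toy setting, 70 trainable numbers generate a 256 x 256 matrix with 65536 entries. The sketch materializes the full matrix, which a deployment kernel might avoid.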

The research highlights three architectures: Tree Tensor Networks (TTNs), augmented TTNs (aTTNs) that incorporate boundary disentanglers, and the Multi-scale Entanglement Renormalization Ansatz (MERA). The formulation supports nonlinear activations, task-specific objectives, batching, and execution schedules tailored to hardware constraints. The paper nonetheless maintains a crucial distinction: differentiating a contraction program does not eliminate the computational cost of the contractions themselves. Automatic differentiation does not avoid the expense of large intermediate tensors, suboptimal contraction orders, or the exact contraction of general loopy tensor networks.
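The point about contraction order can be illustrated with plain arithmetic. The sketch below counts multiply-adds and the largest tensor produced when a chain of three matrices is contracted left to right versus right to left. The matrix sizes and the function names are hypothetical. Since reverse-mode AD replays the forward contractions (typically at a small constant multiple of the forward cost), a poor order stays poor after differentiation.

```python
def left_to_right(dims):
    """Cost of ((A1 A2) A3)... for matrices with shapes dims[i] x dims[i+1].

    Returns (multiply-adds, size of the largest tensor produced).
    """
    flops, peak = 0, 0
    rows = dims[0]
    for k, n in zip(dims[1:-1], dims[2:]):
        flops += rows * k * n        # (rows x k) times (k x n)
        peak = max(peak, rows * n)   # result has rows * n entries
    return flops, peak

def right_to_left(dims):
    """Cost of A1 (A2 (A3 ...)) for the same chain."""
    flops, peak = 0, 0
    cols = dims[-1]
    for m, k in zip(reversed(dims[:-2]), reversed(dims[1:-1])):
        flops += m * k * cols        # (m x k) times (k x cols)
        peak = max(peak, m * cols)   # result has m * cols entries
    return flops, peak

# Hypothetical chain: A is 1000 x 2, B is 2 x 1000, C is 1000 x 2.
dims = [1000, 2, 1000, 2]
print(left_to_right(dims))   # (4000000, 1000000)
print(right_to_left(dims))   # (8000, 2000)
```

Both orders give the same result, but here the first forms a 1000 x 1000 intermediate and needs 500 times more multiply-adds than the second.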

Extensive simulations on layers from AlexNet and VGG-16 show per-layer compression ratios of approximately $2000\times$ to $77000\times$ in the tested configurations. In many cases, accuracy matched that of the dense baselines, and in several VGG-16 scenarios it exceeded them. These findings are preliminary rather than definitive, but they indicate that ADNTNs offer a promising, mathematically rigorous, and hardware-conscious route to much smaller neural networks, provided that optimization strategies, contraction schedules, and deployment kernels are co-designed.
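A per-layer compression ratio of this kind is the number of dense weights divided by the number of trainable core parameters. The sketch below shows the arithmetic for a hypothetical 4096 x 4096 layer and a hypothetical set of core shapes. The shapes and function names are assumptions for illustration, and the resulting ratio is not a figure from the paper.

```python
from math import prod

def param_count(shapes):
    """Total number of entries across a list of tensor shapes."""
    return sum(prod(shape) for shape in shapes)

def compression_ratio(dense_shape, core_shapes):
    """Dense parameter count divided by core parameter count."""
    return prod(dense_shape) / param_count(core_shapes)

# Hypothetical dense layer and hypothetical cores (one top tensor and
# two level cores, each index of size 8).
dense_shape = (4096, 4096)
core_shapes = [(8, 8), (8, 8, 8), (8, 8, 8)]

n_core = param_count(core_shapes)
ratio = compression_ratio(dense_shape, core_shapes)
print(n_core, round(ratio))  # 1088 15420
```

The ratio says nothing about accuracy or runtime. It only counts stored parameters, which is why the abstract treats contraction schedules and kernels as separate concerns.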


Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC
