Sign Lock-In: Randomly Initialized Weight Signs Persist and Bottleneck Sub-Bit Model Compression
Title: The Persistence of Random Signs: How Initialization Lock-In Constrains Sub-Bit Model Compression
Abstract: Sub-bit model compression aims to store parameters at fewer than one bit per weight. As magnitude compression becomes more aggressive, the sign bit emerges as a fixed-cost bottleneck. Across Transformers, CNNs, and MLPs, we find that learned sign matrices resist low-rank approximation and are spectrally indistinguishable from an independent and identically distributed (i.i.d.) Rademacher baseline. This randomness imposes a lower bound on sub-bit compression, a phenomenon we term the "one-bit wall." Despite the apparent randomness, most weights retain their initial signs; sign changes occur mainly through rare boundary crossings near zero. This suggests that the randomness observed in sign patterns is largely inherited from initialization. We formalize this observation with "sign lock-in theory," a stopping-time analysis of sign flips induced by stochastic gradient descent (SGD) noise. We show that, under bounded updates and a rare re-entry condition into a small neighborhood of zero, the number of effective sign flips has a geometric tail. Building on this mechanism, we propose a method for training from scratch with low-rank sign templates, which circumvents the one-bit wall.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC
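
The two measurements described in the abstract, how well a sign matrix is captured by a low-rank approximation and how many weights keep their initial sign, can be illustrated with a minimal NumPy sketch. This is not the paper's code: the function names `top_k_energy` and `sign_retention`, the matrix size, the rank `k`, and the synthetic "trained" weights (a Gaussian initialization plus a small perturbation) are all assumptions chosen for illustration only.

```python
import numpy as np


def top_k_energy(sign_matrix, k):
    """Fraction of the squared Frobenius norm captured by the top-k singular values."""
    s = np.linalg.svd(sign_matrix, compute_uv=False)  # sorted in descending order
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))


def sign_retention(w_init, w_final):
    """Fraction of weights whose sign matches their sign at initialization."""
    return float(np.mean(np.sign(w_init) == np.sign(w_final)))


rng = np.random.default_rng(0)
n, k = 256, 8  # hypothetical layer size and approximation rank

# Hypothetical stand-in for training: the final weights are the initial
# weights plus a small perturbation, so only weights near zero can flip sign.
w_init = rng.normal(0.0, 1.0, size=(n, n))
w_final = w_init + 0.1 * rng.normal(0.0, 1.0, size=(n, n))

learned_signs = np.sign(w_final)
rademacher = rng.choice([-1.0, 1.0], size=(n, n))  # i.i.d. +/-1 baseline

print("top-k energy, learned signs:", top_k_energy(learned_signs, k))
print("top-k energy, Rademacher:   ", top_k_energy(rademacher, k))
print("sign retention:             ", sign_retention(w_init, w_final))
```

In this toy setup the sign matrix is random by construction, so its top-k energy is low and close to the Rademacher baseline, while most signs are still those of the initialization. It only mirrors the shape of the argument; whether real trained networks behave this way is the paper's empirical claim, not something the sketch demonstrates.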



