Structure and Scale in Simplicial Sequence Modelling
Title: The Interplay of Structure and Scale in Simplicial Sequence Modeling
Abstract:
Current large-scale deep learning systems exhibit two prominent empirical phenomena: behavioral scaling laws, which describe predictable improvements in performance as model size increases, and the emergence of specific mechanisms, such as structured internal representations and neural circuits, within deep neural networks. We propose that these two phenomena are intrinsically linked: systematic shifts in behavior stem from corresponding, predictable alterations in the underlying computational architecture. This study presents initial evidence supporting this hypothesis. Specifically, we identify a correlation between performance scaling trends and the evolution of representations in small transformer models trained to forecast the outputs of a hidden Markov model. Notably, in this setting, residual activations have been shown to linearly encode a belief distribution over latent states, that is, a point in a probability simplex.
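As an illustration of the belief distribution mentioned above, the following is a minimal sketch of Bayesian filtering over the latent states of a hidden Markov model. It is not taken from the paper: the 3-state, 2-symbol model, the matrices `transition` and `emission`, and the helper `update_belief` are all hypothetical and chosen only to show that each updated belief is a point in the probability simplex.

```python
import numpy as np

def update_belief(belief, transition, emission, observation):
    """One step of Bayesian filtering over HMM latent states.

    belief[i]        : P(state_t = i | observations so far)
    transition[i, j] : P(state_{t+1} = j | state_t = i)
    emission[j, k]   : P(observation = k | state = j)
    """
    predicted = belief @ transition                       # propagate through the dynamics
    unnormalized = predicted * emission[:, observation]   # weight by observation likelihood
    return unnormalized / unnormalized.sum()              # renormalize onto the simplex

# Hypothetical 3-state, 2-symbol HMM (illustrative values only).
transition = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
emission = np.array([[0.9, 0.1],
                     [0.5, 0.5],
                     [0.1, 0.9]])

belief = np.full(3, 1.0 / 3.0)      # uniform prior over latent states
beliefs = [belief]
for obs in [0, 0, 1, 1, 1]:         # an arbitrary observation sequence
    belief = update_belief(belief, transition, emission, obs)
    beliefs.append(belief)
beliefs = np.stack(beliefs)         # shape (6, 3): one simplex point per time step
```

Each row of `beliefs` is non-negative and sums to one, so the trajectory lives in a 2-simplex. A linear probe on residual activations, in the sense described in the abstract, would be fit to targets of this kind.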
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





