arXiv

An Empirical Audit of Input Encoders for Multi-Channel Signal Transformers

Title: An Empirical Evaluation of Input Encoders for Multi-Channel Signal Transformers

Abstract: When a Transformer processes a multi-channel scalar signal, it must embed $C$ concurrent values into a single $d_{\text{model}}$-dimensional vector at each time step. This study is an empirical audit of eight input encoding strategies: a shared-scalar baseline, per-channel linear projections, orthogonality regularization, nonlinear MLP stems, block-partitioned concatenation, channel-independent and channel-as-token architectures, and projected positional encodings. We evaluate them on a synthetic benchmark engineered to make channel identity matter, and on the ETTh1 dataset to check the findings against real-world data. Performance is measured by next-step negative log-likelihood (NLL).

The primary finding is practical near-equivalence across a broad "top tier" of methods. The standard per-channel linear projection (nn.Linear(C, $d_{\text{model}}$)) performs on par with every other encoder in this tier; the differences are statistically significant but practically negligible. Two encoders are clearly inferior: the shared-scalar baseline, which collapses because of explicit information-theoretic constraints, and a channel-independent baseline inspired by PatchTST, which underperformed on both benchmarks and overfit universally on the synthetic data.
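To make the contrast between the recommended default and the collapsing baseline concrete, here is a minimal PyTorch sketch. The per-channel projection is exactly nn.Linear(C, $d_{\text{model}}$) as named above. The shared-scalar encoder is an assumption about what such a baseline might look like (one embedding direction reused for every channel, with the results summed); the class names and tensor sizes are illustrative and are not taken from the paper or its repository.

```python
import torch
import torch.nn as nn


class PerChannelLinear(nn.Module):
    """Standard projection: each channel owns one column of the weight matrix."""

    def __init__(self, num_channels: int, d_model: int):
        super().__init__()
        self.proj = nn.Linear(num_channels, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, C) -> (batch, time, d_model)
        return self.proj(x)


class SharedScalarEncoder(nn.Module):
    """ASSUMED form of a shared-scalar baseline: the same scalar-to-vector
    map is applied to every channel and the embeddings are summed."""

    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(1, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, time, C, 1) -> (batch, time, C, d_model) -> sum over channels
        return self.proj(x.unsqueeze(-1)).sum(dim=-2)


torch.manual_seed(0)
C, d_model = 4, 16  # illustrative sizes
x = torch.randn(2, 10, C)
x_flipped = x.flip(-1)  # reverse the channel order

per_channel = PerChannelLinear(C, d_model)
shared = SharedScalarEncoder(d_model)

out_per = per_channel(x)
out_shared = shared(x)

# The shared encoder cannot tell which channel a value came from.
shared_is_blind = torch.allclose(out_shared, shared(x_flipped), atol=1e-5)
# The per-channel projection does distinguish the channels.
per_is_blind = torch.allclose(out_per, per_channel(x_flipped), atol=1e-5)
```

Under this assumed form, the shared encoder's output depends on the input only through the sum of the channel values, so reordering the channels leaves it unchanged. That is one simple way in which channel identity could be lost before the Transformer ever sees the signal.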

Paired tests clarify two minor performance gaps. First, passing the sinusoidal positional encoding through a learned linear layer gives a slight advantage at small $C$; a direct geometric analysis attributes this improvement to positional-channel orthogonalization. Second, a nonlinear MLP stem has a marginal edge at the largest $C$ tested, though this advantage shrinks as more training data becomes available. Based on these results, we recommend nn.Linear(C, $d_{\text{model}}$) as the default, resorting to more complex architectures only when a specific task requires them. All code and data needed to reproduce the experiments are available at https://github.com/OssiLehtinen/channel-encoder-audit.
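The projected positional encoding mentioned above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the class name `ProjectedPEEncoder`, the bias-free projection, the `max_len` default, and the `mean_abs_cosine` overlap measure are all assumptions introduced here. The overlap measure is only one hypothetical way to look at how strongly channel directions align with positional vectors; the paper's own geometric analysis may differ.

```python
import math

import torch
import torch.nn as nn


def sinusoidal_pe(seq_len: int, d_model: int) -> torch.Tensor:
    """Fixed sinusoidal table of shape (seq_len, d_model); d_model must be even."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe


class ProjectedPEEncoder(nn.Module):
    """Per-channel linear value embedding plus a linearly projected sinusoidal PE."""

    def __init__(self, num_channels: int, d_model: int, max_len: int = 512):
        super().__init__()
        self.value_proj = nn.Linear(num_channels, d_model)
        # Learned layer applied to the fixed positional table (assumed bias-free).
        self.pe_proj = nn.Linear(d_model, d_model, bias=False)
        self.register_buffer("pe", sinusoidal_pe(max_len, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, C) -> (batch, time, d_model)
        seq_len = x.shape[1]
        return self.value_proj(x) + self.pe_proj(self.pe[:seq_len])


def mean_abs_cosine(encoder: ProjectedPEEncoder, seq_len: int) -> float:
    """Hypothetical diagnostic: mean |cos| between channel directions
    (columns of the value projection) and projected positional vectors."""
    with torch.no_grad():
        channel_dirs = encoder.value_proj.weight.T         # (C, d_model)
        pos_vecs = encoder.pe_proj(encoder.pe[:seq_len])   # (seq_len, d_model)
        c = nn.functional.normalize(channel_dirs, dim=-1)
        p = nn.functional.normalize(pos_vecs, dim=-1)
        return (c @ p.T).abs().mean().item()


torch.manual_seed(0)
encoder = ProjectedPEEncoder(num_channels=4, d_model=16)
out = encoder(torch.randn(2, 10, 4))
overlap = mean_abs_cosine(encoder, seq_len=10)
```

A value of `overlap` near zero would mean the channel and positional directions are close to orthogonal. Tracking such a quantity during training is one way to probe whether the learned projection separates the two subspaces, though the specific measure here is an illustrative choice.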


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
