Gate the Filter, Not the Message: Node-Channel Mixtures for Pre-Propagation GNNs
Abstract
Pre-propagation graph neural networks (PPGNNs) scale well because they move all graph-dependent computation into a preprocessing phase, so that training operates only on the resulting dense hop features. A notable paradox of this framework is that more intricate hop aggregators do not consistently outperform simpler ones: on numerous benchmarks, basic MLP-based aggregators match or even exceed those that use hop attention. We re-examine this phenomenon through the lens of graph filtering. Within a precomputed diffusion basis, existing PPGNNs differ primarily not in raw aggregator capacity but in how filter coefficients are shared across nodes and feature channels. Specifically, MLP-based models typically learn filters that adapt to channels but are largely shared across nodes, whereas hop-attention models generally produce node-dependent mixtures that are predominantly shared across channels. This observation exposes a gap in current PPGNN architectures: none performs joint node- and channel-adaptive filtering under the constraints of pre-propagation computation. To address this gap, we introduce FilterMoE, a mixture-of-experts PPGNN that uses a compact bank of learnable Chebyshev filter experts, routed jointly across nodes and channels via a 3D gating tensor. Evaluated on eleven homophilic and heterophilic benchmarks, FilterMoE outperforms strong PPGNN baselines on nine datasets and ranks first on all three large-scale benchmarks, with an average improvement of 1.53 points in test score. These findings indicate that joint node-channel filter routing is a robust alternative to dataset-specific hop-aggregator selection.
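The routing idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`chebyshev_basis`, `filter_moe`), the linear router `W_router`, the softmax gate, the toy path graph, and the assumption that the largest Laplacian eigenvalue is 2 are all hypothetical choices made for illustration. The sketch precomputes Chebyshev hop features once, then mixes a small bank of expert coefficient vectors with a gate of shape (nodes, channels, experts), so that each node-channel pair receives its own effective filter.

```python
import numpy as np

def chebyshev_basis(L_hat, X, K):
    """Precompute T_0(L_hat) X, ..., T_K(L_hat) X; result has shape (K+1, N, C)."""
    basis = [X, L_hat @ X]
    for _ in range(2, K + 1):
        # Chebyshev recurrence: T_k = 2 * L_hat * T_{k-1} - T_{k-2}
        basis.append(2.0 * L_hat @ basis[-1] - basis[-2])
    return np.stack(basis[: K + 1])

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def filter_moe(basis, theta, gate_logits):
    """Mix E Chebyshev filter experts per node and per channel.

    basis:       (K+1, N, C) precomputed hop features
    theta:       (E, K+1)    one coefficient vector per expert (assumed layout)
    gate_logits: (N, C, E)   3D gating tensor before normalisation
    """
    gate = softmax(gate_logits, axis=-1)   # (N, C, E), sums to 1 over experts
    coeff = gate @ theta                   # (N, C, K+1) effective coefficients
    return np.einsum("nck,knc->nc", coeff, basis)

# Toy example: a 4-node path graph (hypothetical data).
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
# Scaled Laplacian L - I with lambda_max assumed to be 2, i.e. -D^{-1/2} A D^{-1/2}.
L_hat = -(d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :])

N, C, K, E = 4, 3, 2, 2
X = rng.normal(size=(N, C))
basis = chebyshev_basis(L_hat, X, K)       # the only graph-dependent step

theta = rng.normal(size=(E, K + 1))        # would be learnable in practice
W_router = rng.normal(size=(C, C * E))     # hypothetical linear router
gate_logits = (X @ W_router).reshape(N, C, E)

out = filter_moe(basis, theta, gate_logits)
print(out.shape)  # (4, 3)
```

Under these assumptions, the two sharing patterns described in the abstract appear as special cases of the gate: a gate that is constant along the node axis gives channel-adaptive filters shared across nodes (the MLP-like pattern), while a gate that is constant along the channel axis gives node-dependent mixtures shared across channels (the hop-attention-like pattern).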
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





