LiveBand: Live Accompaniment Generation in the Audio Domain
Title: LiveBand: Real-Time Audio Domain Accompaniment Synthesis
Abstract:
This paper introduces LiveBand, a novel real-time system designed to create high-fidelity musical accompaniments from live audio inputs while strictly adhering to causal constraints. The proposed approach utilizes a causal transformer generator trained within the continuous latent space of a pre-existing causal audio autoencoder. Training is supervised adversarially at the sequence level via a discriminator. During operation, the generator relies solely on the causally accessible mix context and Gaussian noise at each timestep, predicting accompaniment latents without access to future mix frames or ground-truth targets.
Training runs in a single parallel forward pass under causal masking, whereas streaming inference runs autoregressively with a rolling attention state. Because the generator conditions solely on the causally available mix context and Gaussian noise, both modes perform the same computation at every timestep, which removes the need for teacher forcing and mitigates exposure bias. Evaluated on a multi-instrument music accompaniment benchmark, LiveBand outperforms previous methods on objective metrics of audio quality, beat alignment, and mix adherence. It also supports real-time streaming generation on consumer-grade hardware without any lookahead into future input.
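The match between parallel, causally masked training and streaming inference with a rolling attention state can be illustrated with a minimal sketch. This is not the authors' implementation: a single attention head stands in for the causal transformer, and the function names, the `window` size, and all dimensions are hypothetical choices made for illustration. The input at each timestep is a stand-in mix latent concatenated with Gaussian noise, as the abstract describes.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def parallel_causal_attention(x, Wq, Wk, Wv, window):
    """Training-style pass: all timesteps at once under a causal, windowed mask."""
    T = x.shape[0]
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    t = np.arange(T)
    # Position i may attend to j only if i - window < j <= i (no future frames).
    allowed = (t[None, :] <= t[:, None]) & (t[None, :] > t[:, None] - window)
    scores = np.where(allowed, scores, -np.inf)
    return softmax(scores) @ v

def streaming_causal_attention(x, Wq, Wk, Wv, window):
    """Inference-style pass: one frame at a time with a rolling key/value state."""
    keys, values, outputs = [], [], []
    for x_t in x:
        keys.append(x_t @ Wk)
        values.append(x_t @ Wv)
        # Rolling state: keep only the most recent `window` frames.
        keys, values = keys[-window:], values[-window:]
        K, V = np.stack(keys), np.stack(values)
        q_t = x_t @ Wq
        weights = softmax(q_t @ K.T / np.sqrt(q_t.shape[-1]))
        outputs.append(weights @ V)
    return np.stack(outputs)

# Hypothetical sizes; the random arrays stand in for real latents.
rng = np.random.default_rng(0)
T, d_mix, d_head = 16, 8, 4
mix = rng.standard_normal((T, d_mix))    # causally available mix context
noise = rng.standard_normal((T, d_mix))  # per-timestep Gaussian noise
x = np.concatenate([mix, noise], axis=-1)
Wq, Wk, Wv = (rng.standard_normal((2 * d_mix, d_head)) for _ in range(3))

y_par = parallel_causal_attention(x, Wq, Wk, Wv, window=5)
y_str = streaming_causal_attention(x, Wq, Wk, Wv, window=5)
max_diff = float(np.abs(y_par - y_str).max())
print("max |parallel - streaming| =", max_diff)
```

In this sketch the two passes agree up to floating-point error because neither one feeds generated outputs back in as inputs. That property is what lets a parallel training pass stand in for frame-by-frame streaming.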
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC



