Mitigating Bias in Locally Constrained Decoding via Tractable Proposals
Title: Reducing Bias in Locally Constrained Decoding Through Efficient Proposal Mechanisms
Abstract: Large language models often fail to produce outputs that adhere to strict formatting requirements, such as JSON schemas. Current locally constrained decoding (LCD) techniques enforce these rules by myopically eliminating invalid next tokens at each step, which introduces sampling bias and harms overall performance. Recent studies have applied sequential Monte Carlo (SMC) methods to correct this bias, but designing effective proposal distributions and potential functions remains difficult. This paper introduces a universal framework for constructing proposals and potentials for SMC sampling from the conditional language model distribution $p_{\mathrm{lm}}(\cdot \mid \mathrm{constraint})$. We show that constraints defined by finite automata can be tensorized for efficient GPU execution, enabling globally constrained decoding (GCD) proposals. Furthermore, by exploiting the structural similarity between tensorized finite automata and hidden Markov models, we use circuit multiplication to derive probabilistic GCD (P-GCD) proposals, which capture both the logical constraints and the probabilistic nuances of the target distribution. We evaluate (P-)GCD on function calling, keyword-based generation, and SQL generation tasks. Our results indicate that, within an identical SMC sampling framework, (P-)GCD proposals converge to the target distribution faster than LCD proposals and require significantly fewer particles.
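The bias attributed to LCD above can be made concrete with a small toy calculation. The sketch below is illustrative only and is not taken from the paper: the two-token vocabulary, the next-token probabilities, the constraint ("the sequence must end in a"), and all function names are assumptions chosen for the example. It enumerates every length-2 sequence exactly, compares the distribution induced by local masking with the conditional target $p_{\mathrm{lm}}(\cdot \mid \mathrm{constraint})$, and then applies importance weights (the mass kept after masking at each step), which is the kind of correction that SMC-style samplers rely on.

```python
from itertools import product

VOCAB = ("a", "b")
SEQ_LEN = 2

# Hypothetical toy language model: next-token probabilities given a prefix.
NEXT_TOKEN_PROBS = {
    "": {"a": 0.5, "b": 0.5},
    "a": {"a": 0.5, "b": 0.5},
    "b": {"a": 0.1, "b": 0.9},
}


def satisfies_constraint(seq):
    """Toy constraint (an assumption): a full-length sequence ending in 'a'."""
    return len(seq) == SEQ_LEN and seq.endswith("a")


def has_valid_completion(prefix):
    """True if some full-length extension of the prefix satisfies the constraint."""
    remaining = SEQ_LEN - len(prefix)
    return any(
        satisfies_constraint(prefix + "".join(tail))
        for tail in product(VOCAB, repeat=remaining)
    )


def lm_prob(seq):
    """Unconstrained probability of a sequence under the toy model."""
    prob = 1.0
    for i, token in enumerate(seq):
        prob *= NEXT_TOKEN_PROBS[seq[:i]][token]
    return prob


def target_distribution():
    """Exact conditional distribution p_lm(seq | constraint)."""
    joint = {}
    for tokens in product(VOCAB, repeat=SEQ_LEN):
        seq = "".join(tokens)
        if satisfies_constraint(seq):
            joint[seq] = lm_prob(seq)
    total = sum(joint.values())
    return {seq: p / total for seq, p in joint.items()}


def lcd_distribution():
    """Distribution induced by masking invalid tokens and renormalizing per step.

    Also returns, per sequence, the product of the per-step allowed mass,
    which serves as its importance weight.
    """
    proposal, weights = {}, {}
    for tokens in product(VOCAB, repeat=SEQ_LEN):
        seq = "".join(tokens)
        prob, weight, reachable = 1.0, 1.0, True
        for i, token in enumerate(seq):
            prefix = seq[:i]
            allowed = [t for t in VOCAB if has_valid_completion(prefix + t)]
            if token not in allowed:
                reachable = False
                break
            mass = sum(NEXT_TOKEN_PROBS[prefix][t] for t in allowed)
            prob *= NEXT_TOKEN_PROBS[prefix][token] / mass
            weight *= mass
        if reachable:
            proposal[seq] = prob
            weights[seq] = weight
    return proposal, weights


def reweighted_lcd():
    """LCD proposal corrected by self-normalized importance weights."""
    proposal, weights = lcd_distribution()
    unnormalized = {seq: proposal[seq] * weights[seq] for seq in proposal}
    total = sum(unnormalized.values())
    return {seq: p / total for seq, p in unnormalized.items()}


if __name__ == "__main__":
    print("target:      ", target_distribution())
    print("LCD:         ", lcd_distribution()[0])
    print("reweighted:  ", reweighted_lcd())
```

With these toy numbers, local masking assigns probability 0.5 to each of "aa" and "ba", whereas the conditional target assigns 5/6 and 1/6; multiplying by the importance weights recovers the target exactly. In a real sampler the weights are estimated from a finite set of particles rather than enumerated, so a proposal closer to the target would be expected to need fewer particles for the same accuracy.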
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





