Evidence-Gated LLM Priors for Multi-Objective Bayesian Optimization
Title: Leveraging Evidence-Gated LLM Priors in Multi-Objective Bayesian Optimization
Abstract
While large language models (LLMs) are increasingly deployed as heuristic advisors in black-box optimization, their recommendations and self-assessed confidence levels are often poorly calibrated against the actual objective values. This misalignment is particularly acute in multi-objective Bayesian optimization, where distinct objectives demand specialized expertise, so an LLM can be valuable for one objective while misleading for another. This study investigates how to integrate LLM-generated expert priors into discrete multi-objective Bayesian optimization without relying on them uncritically. We introduce an objective-specific reputation-market framework that treats each expert-objective combination as a falsifiable source of prior knowledge. Within this system, expert weights are dynamically updated based on observed feedback, subject to temporal discounting, and regulated by a market-level trust metric. Furthermore, we propose a decoupled counterfactual gate that offers three operational modes: using the LLM prior with its confidence, using it without its confidence, or disregarding the prior entirely.
Empirical evaluations across controlled synthetic stress tests and three molecule optimization benchmarks, using expert priors generated by \qwenflash{}, demonstrate that dynamic, objective-wise calibration enhances robustness compared to static LLM priors. However, raw LLM confidence is not consistently advantageous: on ESOL, confidence is positively correlated with prediction error; on FreeSolv, it offers some benefit; and on Lipophilicity, disregarding confidence yields the best performance. Regarding the gating mechanism, our fixed three-arm counterfactual gate outperforms the initial counterfactual variant on both ESOL and FreeSolv. Additionally, an investigation into a margin portfolio strategy yields a significant negative finding: margin selection should be driven by acquisition-aware criteria rather than relying solely on one-step prior error.
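To make the mechanism described in the abstract concrete, the following is a minimal sketch of one way objective-wise reputation weights and a three-arm gate could be organized. It is not the paper's implementation. The update rule (a discounted exponential-weights scheme on squared error), the parameters `eta` and `discount`, and the names `ReputationMarket` and `choose_arm` are all assumptions introduced for illustration. The market-level trust metric is omitted because the abstract does not specify it.

```python
import math

# The three operating modes named in the abstract (identifiers are hypothetical).
ARMS = ("prior_with_confidence", "prior_without_confidence", "no_prior")


class ReputationMarket:
    """Discounted exponential-weights reputation per (expert, objective) pair.

    Assumption: this update rule is a generic stand-in, not the paper's rule.
    """

    def __init__(self, experts, objectives, eta=1.0, discount=0.9):
        self.eta = eta            # learning rate (assumed)
        self.discount = discount  # temporal discount on past losses (assumed)
        # Discounted cumulative loss for each expert-objective pair.
        self.loss = {(e, o): 0.0 for e in experts for o in objectives}

    def update(self, expert, objective, predicted, observed):
        """Fold one observed squared error into the pair's discounted loss."""
        key = (expert, objective)
        err = (predicted - observed) ** 2
        self.loss[key] = self.discount * self.loss[key] + err

    def weights(self, objective):
        """Return normalized weights over experts for a single objective."""
        keys = [k for k in self.loss if k[1] == objective]
        best = min(self.loss[k] for k in keys)  # shift for numerical stability
        raw = {k[0]: math.exp(-self.eta * (self.loss[k] - best)) for k in keys}
        total = sum(raw.values())
        return {e: w / total for e, w in raw.items()}


def choose_arm(arm_losses):
    """Pick the arm with the lowest tracked counterfactual loss (assumed criterion)."""
    return min(ARMS, key=lambda a: arm_losses[a])
```

In this sketch, an expert that is accurate on one objective and poor on another receives a separate weight for each, which is the objective-wise behavior the abstract describes. Selecting the gate arm by lowest tracked loss is only one possible criterion.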
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




