arXiv

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

June 2, 2026 · Hejia Zhang, Zhongming Yu, Chia-Tung Ho, Haoxing Ren, Brucek Khailany, Jishen Zhao

Abstract:

Large language model (LLM) agents that learn from execution feedback are a promising approach to tool-based learning, but the cost and latency of acquiring that feedback often make online reinforcement learning (RL) impractical. The problem is especially acute in high-coverage hardware verification, which depends heavily on industrial simulators and non-differentiable execution outputs. To address these limitations, we introduce LLM4Cov, an offline agent-learning framework that models verification as single-step state transitions guided by deterministic evaluators. Within this formulation, we implement execution-validated data curation, policy-aware agentic data synthesis, and worst-state-prioritized sampling to enable scalable learning under strict execution constraints. We also present a reality-aligned benchmark, derived from an existing verification suite and refined with an updated evaluation protocol. Using the proposed pipeline, a compact 4B-parameter model attains a 69.2% pass rate and 90.4% average coverage on the CVDP-ECov benchmark, exceeding its teacher model by 5.3% and 10.5%, respectively, and performing competitively with models an order of magnitude larger.
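
The abstract names worst-state-prioritized sampling but does not specify how it works. As a purely illustrative sketch, and not the authors' implementation, one plausible reading is that states whose evaluator-reported coverage is lowest are drawn more often when building training data. The function names, the softmax weighting, and the `temperature` parameter below are all assumptions made for illustration.

```python
import math
import random


def worst_state_weights(coverages, temperature=0.1):
    """Hypothetical weighting: lower coverage gives a larger sampling weight.

    `coverages` are fractions in [0, 1], assumed to come from a deterministic
    evaluator. The softmax over coverage gaps is an assumption, not the paper's rule.
    """
    gaps = [1.0 - c for c in coverages]
    largest = max(gaps)
    # Subtract the largest gap before exponentiating for numerical stability.
    exps = [math.exp((g - largest) / temperature) for g in gaps]
    total = sum(exps)
    return [e / total for e in exps]


def sample_states(states, coverages, k, seed=0):
    """Draw k states with replacement, favoring the lowest-coverage ones."""
    rng = random.Random(seed)
    weights = worst_state_weights(coverages)
    return rng.choices(states, weights=weights, k=k)
```

Under these assumptions, a call such as `sample_states(["tb_a", "tb_b", "tb_c"], [0.95, 0.40, 0.70], k=5)` would mostly return the hypothetical state `"tb_b"`, since it has the lowest coverage; a larger `temperature` would flatten the preference.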

Source: arXiv