Symbolic Neural Generation with Applications to Lead Discovery in Drug Design
Abstract:
This study explores a less-investigated category of hybrid neurosymbolic architectures that merge symbolic learning with neural reasoning to create data generators adhering to formal correctness standards. In Symbolic Neural Generators (SNGs), symbolic learners analyze logical specifications derived from feasible data, using a limited set of examples, potentially as few as a single instance. Each specification then restricts the conditional input provided to a neural generator, which discards any generated sample that fails to meet the symbolic criteria. Like other neurosymbolic frameworks, an SNG exploits the complementary strengths of symbolic and neural techniques. The output of an SNG is a pair $(H, X)$: $H$ is a symbolic characterization of viable instances built from existing data, and $X$ is a collection of newly generated instances that conform to this characterization.
We propose a semantic framework for these systems, constructed by combining suitable base and fiber partially-ordered sets into a unified partial order. To demonstrate practical utility, we implemented an SNG that integrates a constrained version of Inductive Logic Programming (ILP) with a large language model (LLM) and tested it in early-stage drug discovery. Our primary focus is on the resulting symbolic descriptions and the pool of potential inhibitor molecules produced by the system. In benchmark scenarios involving well-characterized drug targets, the SNG's performance is statistically comparable to that of state-of-the-art methods. In more exploratory contexts with poorly understood targets, the generated molecules showed binding affinities comparable to those of top-tier clinical candidates. Furthermore, domain experts judged the symbolic specifications useful as initial filters and identified several generated compounds as promising candidates for synthesis and subsequent wet-lab evaluation.
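The learn-then-generate-then-reject loop described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen for illustration only: `learn_spec` is a toy stand-in for the symbolic learner (it is not ILP), `toy_generate` is a random stand-in for the neural generator (it is not an LLM and ignores its conditioning string), and the `weight` and `rings` features are invented. The sketch only shows how a specification $H$ learned from a single feasible instance can both condition a generator and filter its output, yielding the pair $(H, X)$.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: an instance is a small feature dictionary,
# and a specification H is a conjunction of named predicates.
Instance = Dict[str, float]
Predicate = Callable[[Instance], bool]
Spec = Dict[str, Predicate]


def learn_spec(feasible: List[Instance], candidates: Spec) -> Spec:
    """Toy symbolic learner (assumption, not ILP): keep every candidate
    predicate that holds for all feasible examples."""
    return {name: pred for name, pred in candidates.items()
            if all(pred(x) for x in feasible)}


def satisfies(spec: Spec, x: Instance) -> bool:
    """True if x meets every predicate in the specification."""
    return all(pred(x) for pred in spec.values())


def sng(feasible: List[Instance], candidates: Spec,
        generate: Callable[[str], Instance],
        n_wanted: int, max_tries: int = 1000) -> Tuple[Spec, List[Instance]]:
    """Return (H, X): the learned specification and accepted samples."""
    h = learn_spec(feasible, candidates)
    # The specification is passed to the generator as conditioning text.
    condition = " and ".join(sorted(h))
    accepted: List[Instance] = []
    for _ in range(max_tries):
        if len(accepted) >= n_wanted:
            break
        x = generate(condition)
        if satisfies(h, x):  # reject anything violating H
            accepted.append(x)
    return h, accepted


rng = random.Random(0)


def toy_generate(condition: str) -> Instance:
    """Stand-in for a neural generator; it ignores the conditioning string."""
    return {"weight": rng.uniform(100.0, 700.0),
            "rings": float(rng.randint(0, 4))}


CANDIDATES: Spec = {
    "weight_le_500": lambda x: x["weight"] <= 500.0,
    "weight_le_300": lambda x: x["weight"] <= 300.0,
    "has_ring": lambda x: x["rings"] >= 1.0,
}

# A single feasible instance is enough to learn a (weak) specification.
H, X = sng([{"weight": 320.0, "rings": 2.0}], CANDIDATES,
           toy_generate, n_wanted=5)
```

Because the filter is applied after generation, every element of `X` satisfies `H` by construction, regardless of how well the generator follows its conditioning. A real system would presumably rely on the conditioning to keep the rejection rate low.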
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




