A Padding Method for Enhanced Encoding of Inorganic Structures with Varying Chemical Compositions
Original: arXiv:2605.30743v2 Announce Type: replace-cross Abstract: Designing novel inorganic materials through generative models remains an important challenge for materials science, driven by the complexity and diversity of inorganic structures across expansive chemical compositions and structural landscapes. The vast combinatorial space of inorganic compounds demands innovative, AI-driven approaches to overcome limitations in generative accuracy and efficiency. To address this, we introduce a novel method that redefines the encoding and generation of inorganic materials by utilizing a domain-specific, symmetry-aware representation. Our approach not only refines the representation of intricate inorganic structures but also contributes to the field of materials discovery by enhancing the precision and stability of generated candidates. Central to our methodology is a novel padding technique that exploits crystal symmetry information to enhance the encoding process. By integrating Wyckoff position length-aware padding into an encoder architecture, we achieve a more robust and informed representation of inorganic materials. This symmetry-driven enhancement enables deep learning models to generate stable, previously unexplored inorganic structures with superior accuracy and computational efficiency. Furthermore, we introduce an end-to-end system that leverages machine learning potential models to seamlessly generate novel and stable inorganic materials, including those unseen in the training data, from initial data to validated output. This pipeline integrates advanced generative models with stability analysis, marking a significant leap forward in the automated exploration and design of next-generation inorganic materials. Our method improved reconstruction accuracy by 5.3% on proton conductor data and generated 63.5% more novel, stable inorganic materials than the baseline model on the perov-5 dataset.
Rewrite: Creating new inorganic materials with generative models remains a significant challenge for materials science, largely because of the variety and structural complexity of inorganic compounds across a broad range of chemical compositions. Because the combinatorial space of inorganic compounds is so large, the field needs AI-based strategies that overcome current limits on generative accuracy and efficiency. In response, we present a method that changes how inorganic materials are encoded and generated by using a domain-specific, symmetry-aware representation. This strategy does more than sharpen the representation of complex crystal structures; it advances materials discovery by improving the precision and stability of the generated candidates. At the heart of the method is a padding technique that uses crystal symmetry information to improve encoding. We incorporate padding that accounts for the length of Wyckoff positions into an encoder architecture, which yields a more robust and informed structural representation. This use of symmetry allows deep learning models to generate stable, previously unexplored inorganic structures with greater accuracy and computational efficiency. Additionally, we introduce an end-to-end system that uses machine learning potential models to generate novel and stable inorganic materials, including structures that do not appear in the training data, from initial data to validated output. By combining generative models with stability analysis, this pipeline advances the automated exploration and design of next-generation inorganic materials. Our method improved reconstruction accuracy by 5.3% on proton conductor data and generated 63.5% more novel, stable inorganic materials than the baseline model on the perov-5 dataset.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
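
The abstract names Wyckoff position length-aware padding but does not spell out the procedure, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that each occupied Wyckoff position is given a fixed-width block as wide as the largest multiplicity, so that symmetry-equivalent atoms stay grouped together and padding is added per position rather than only at the end of a flat atom list. The names WyckoffSite, pad_structure, PAD_ELEMENT and PAD_COORD are hypothetical, as is the choice of atomic number 0 as the padding token. The example structure is cubic SrTiO3 (space group Pm-3m), with Sr on 1a, Ti on 1b and O on 3c.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float, float]

PAD_ELEMENT = 0                      # assumption: atomic number 0 marks a padded slot
PAD_COORD: Coord = (0.0, 0.0, 0.0)   # assumption: padded slots carry zero coordinates


@dataclass
class WyckoffSite:
    """One occupied Wyckoff position (hypothetical container for this sketch)."""
    atomic_number: int
    coords: List[Coord]  # fractional coordinates of every atom in the orbit

    @property
    def multiplicity(self) -> int:
        return len(self.coords)


def pad_structure(sites: List[WyckoffSite], max_sites: int, max_multiplicity: int):
    """Pad each Wyckoff position to max_multiplicity slots, then pad the
    structure to max_sites positions. Returns (elements, coords, mask),
    each of length max_sites * max_multiplicity; mask is 1 for real atoms."""
    if len(sites) > max_sites:
        raise ValueError("structure has more Wyckoff positions than max_sites")

    elements: List[int] = []
    coords: List[Coord] = []
    mask: List[int] = []

    for site in sites:
        if site.multiplicity > max_multiplicity:
            raise ValueError("multiplicity exceeds max_multiplicity")
        n_pad = max_multiplicity - site.multiplicity
        # Real atoms of this orbit first, then padding inside the same block.
        elements += [site.atomic_number] * site.multiplicity + [PAD_ELEMENT] * n_pad
        coords += list(site.coords) + [PAD_COORD] * n_pad
        mask += [1] * site.multiplicity + [0] * n_pad

    # Fill the remaining, unoccupied blocks entirely with padding.
    n_empty = (max_sites - len(sites)) * max_multiplicity
    elements += [PAD_ELEMENT] * n_empty
    coords += [PAD_COORD] * n_empty
    mask += [0] * n_empty
    return elements, coords, mask


# Cubic SrTiO3: Sr on 1a, Ti on 1b, O on 3c.
srtio3 = [
    WyckoffSite(38, [(0.0, 0.0, 0.0)]),
    WyckoffSite(22, [(0.5, 0.5, 0.5)]),
    WyckoffSite(8, [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]),
]

elements, coords, mask = pad_structure(srtio3, max_sites=4, max_multiplicity=3)
print(elements)
print(mask)
```

Under this assumed layout, every structure maps to a tensor of the same shape regardless of composition, and the mask lets an encoder ignore padded slots. Whether the paper pads within positions, between positions, or in some other symmetry-dependent way cannot be determined from the abstract alone.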





