Hybrid Neural Ordinary Differential Equations for Data-Efficient Polymerization Modeling with Incomplete Kinetics
Title: Leveraging Hybrid Neural Ordinary Differential Equations for Efficient Polymerization Modeling Amidst Kinetic Uncertainty
Abstract:
Precise forecasting of polymerization behavior is a cornerstone for effective process design, control strategies, and optimization. However, traditional approaches face significant hurdles: mechanistic models necessitate time-consuming parameterization of kinetics that are only partially understood, whereas purely data-driven methods rely on extensive, high-quality datasets that are often prohibitively expensive to acquire, especially during the initial phases of design. To address these challenges, we introduce a hybrid Neural Ordinary Differential Equation (NODE) framework designed for data-efficient modeling of free-radical polymerization processes.
Using the batch polymerization of methyl methacrylate (MMA) as a case study, our approach explicitly retains the mechanistic mass balances. Within this structure, a neural network surrogate learns only the partially characterized effective radical concentration that drives monomer consumption, while well-established reactions (initiator decomposition, propagation, and termination) are modeled physically.
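The structure described above can be illustrated with a minimal sketch, assuming a two-state model (initiator and monomer) in which only the effective radical concentration is supplied by a neural network. The rate constants, network size, output scaling, initial concentrations, and the fixed-step RK4 integrator are all hypothetical choices made for illustration; they are not taken from the paper, and the network here is untrained.

```python
import numpy as np

# Hypothetical rate constants (illustrative only, not values from the paper).
K_D = 1.0e-2  # initiator decomposition rate constant, 1/min
K_P = 5.0e1   # propagation rate constant, L/(mol*min)

# Tiny, untrained MLP standing in for the learned closure term.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(8, 2))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(1, 8))
b2 = np.zeros(1)


def radical_closure(initiator, monomer):
    """Neural surrogate for the effective radical concentration (mol/L)."""
    hidden = np.tanh(W1 @ np.array([initiator, monomer]) + b1)
    raw = (W2 @ hidden + b2)[0]
    # Softplus keeps the output positive; the 1e-3 scale is an assumption.
    return 1.0e-3 * np.log1p(np.exp(raw))


def hybrid_rhs(t, y):
    """Mechanistic mass balances with one learned term inside them."""
    initiator, monomer = y
    d_initiator = -K_D * initiator  # known first-order decomposition
    d_monomer = -K_P * monomer * radical_closure(initiator, monomer)
    return np.array([d_initiator, d_monomer])


def rk4(rhs, y0, t_grid):
    """Classical fourth-order Runge-Kutta integration over t_grid."""
    states = [np.asarray(y0, dtype=float)]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        h = t1 - t0
        y = states[-1]
        k1 = rhs(t0, y)
        k2 = rhs(t0 + h / 2, y + h / 2 * k1)
        k3 = rhs(t0 + h / 2, y + h / 2 * k2)
        k4 = rhs(t1, y + h * k3)
        states.append(y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4))
    return np.stack(states)


t_grid = np.linspace(0.0, 60.0, 241)              # minutes
trajectory = rk4(hybrid_rhs, [0.05, 5.0], t_grid)  # columns: [I], [M] in mol/L
```

Because the closure output is constrained to be positive and enters the monomer balance multiplicatively, monomer can only be consumed in this sketch, whatever the network weights are. This is one way a retained mass balance can keep extrapolations physically plausible.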
We benchmarked the hybrid NODE against a purely data-driven NODE and a discrete-time feedforward neural network under sparse-data conditions, training all models on datasets with as few as ten measurements under both regular and irregular sampling. The hybrid NODE consistently outperformed both data-driven baselines, yielding lower prediction errors and more physically consistent extrapolations.
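To make the two sampling regimes concrete, the following sketch builds one regular and one irregular schedule of ten measurement times. The batch duration, the random seed, and the use of uniformly distributed random times are assumptions for illustration only; the abstract does not say how the irregular samples were drawn.

```python
import numpy as np

N_MEASUREMENTS = 10
T_END = 60.0  # hypothetical batch duration, min

# Regular sampling: ten equally spaced measurement times.
regular_times = np.linspace(0.0, T_END, N_MEASUREMENTS)

# Irregular sampling: ten sorted random times (uniform draw is an assumption).
rng = np.random.default_rng(1)
irregular_times = np.sort(rng.uniform(0.0, T_END, size=N_MEASUREMENTS))

# Gaps between consecutive measurements differ in the irregular case.
regular_steps = np.diff(regular_times)
irregular_steps = np.diff(irregular_times)
```

A continuous-time model can be evaluated at arbitrary measurement times, whereas a discrete-time model is usually formulated for a fixed step, which is one reason irregular sampling is a meaningful test for this comparison.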
In a generalization test with noisy data and previously unseen operating conditions, the hybrid NODE achieved a root mean square error (RMSE) of 0.013, compared with 0.31 for the data-driven NODE and 0.68 for the discrete-time model. These results indicate that restricting learning to a single closure term, rather than the full dynamics, is sufficient for reliable predictions even when data are severely limited.
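For reference, the RMSE metric quoted above can be computed as in the short sketch below. The arrays are made-up placeholder values, not data from the paper, and the abstract does not state which quantity (for example, conversion or a scaled concentration) the reported errors refer to.

```python
import numpy as np


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


# Placeholder values for illustration only.
observed = np.array([0.10, 0.25, 0.40, 0.55])
predicted = np.array([0.12, 0.24, 0.43, 0.52])
score = rmse(predicted, observed)
```

Because the errors are squared before averaging, a few large deviations (as can occur when a model extrapolates poorly) dominate the score.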
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





