ASAP: Exploiting the Satisficing Generalization Edge in Neural Combinatorial Optimization
Title: ASAP: Leveraging the Satisficing Generalization Advantage in Neural Combinatorial Optimization
Abstract:
Deep Reinforcement Learning (DRL) has gained traction as a method for Combinatorial Optimization (CO) problems such as the 3D Bin Packing Problem (3D-BPP), the Traveling Salesman Problem (TSP), and the Vehicle Routing Problem (VRP). However, neural solvers often lose robustness under distribution shift. In this work, we identify and validate, both theoretically and empirically, the "Satisficing Generalization Edge": selecting a subset of viable actions generalizes better than pinpointing a single optimal choice.
To capitalize on this insight, we introduce Adaptive Selection After Proposal (ASAP), a versatile framework that splits decision-making into two separate stages. The first stage employs a proposal policy that serves as a resilient filter, while the second utilizes a selection policy that acts as a flexible decision-maker. This structure facilitates a highly efficient online adaptation mechanism, allowing the selection policy to be quickly fine-tuned to new data distributions. Specifically, we present a two-phase training approach integrated with Model-Agnostic Meta-Learning (MAML) to prepare the model for rapid adaptation. Our comprehensive experiments on 3D-BPP, TSP, and CVRP reveal that ASAP significantly enhances the generalization performance of current state-of-the-art baselines and delivers superior online adaptation capabilities for out-of-distribution instances.
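The two-stage split described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the proposal policy keeps the top-k actions by score and that the selection policy samples from a softmax restricted to those candidates. The function names, the top-k rule, and the toy scores are all hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def propose(proposal_scores, k):
    """Stage 1 (assumed form): keep the indices of the k highest-scoring actions."""
    ranked = sorted(range(len(proposal_scores)),
                    key=lambda i: proposal_scores[i], reverse=True)
    return ranked[:k]


def select(candidates, selection_scores, rng):
    """Stage 2 (assumed form): sample one action, restricted to the candidates."""
    probs = softmax([selection_scores[i] for i in candidates])
    return rng.choices(candidates, weights=probs, k=1)[0]


def two_stage_act(proposal_scores, selection_scores, k, rng):
    """Filter with the proposal stage, then decide with the selection stage."""
    candidates = propose(proposal_scores, k)
    return select(candidates, selection_scores, rng), candidates


# Toy scores standing in for network outputs (not real model output).
rng = random.Random(0)
proposal_scores = [0.1, 2.0, 1.5, -0.3, 1.9]
selection_scores = [5.0, 0.2, 1.0, 4.0, 0.4]
action, candidates = two_stage_act(proposal_scores, selection_scores, k=3, rng=rng)
print("candidates:", candidates, "chosen action:", action)
```

In this sketch the selection stage never sees actions that the proposal stage filtered out, even when its own score for them is high (action 0 here). Under that assumption, online adaptation would only need to update the parameters behind `selection_scores`, leaving the filter fixed.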
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



