High-Precision APT Malware Attribution with Out-of-Scope Resilience
Title: Enhancing APT Malware Attribution Precision Through Out-of-Scope Resilience
Abstract: Promptly identifying Advanced Persistent Threat (APT) activity enables security defenders to prioritize investigations, implement appropriate countermeasures, and mitigate intrusion damage. While malware serves as a valuable source of attribution evidence, automating this process remains a significant challenge. Current methodologies generally operate as closed-set classifiers, trained and assessed on a restricted set of known APT groups. However, in real-world operational settings, these systems inevitably encounter samples from groups absent from their training data. Consequently, closed-set classifiers are compelled to map these unknown samples to known groups, resulting in unsupported and potentially deceptive attributions.
To address this, we introduce a high-precision APT malware attribution framework built on ranked binary classifiers with explicit abstention. Instead of relying on a single multi-class classifier, our method trains and calibrates two binary classifiers for each APT group. These classifiers are then ranked by validation performance and applied sequentially. A sample is attributed only if a classifier presents sufficient evidence; otherwise, the system abstains from making a prediction.
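The rank-then-apply procedure with abstention can be illustrated with a minimal sketch. This is not the authors' implementation: the names GroupClassifier, rank_classifiers, and attribute are hypothetical, the ranking key (validation precision) and the per-classifier probability threshold are assumptions about what "validation performance" and "sufficient evidence" mean, and each score function stands in for an already trained and calibrated binary classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class GroupClassifier:
    """One calibrated binary classifier tagged with the APT group it detects."""
    group: str
    # Assumed to return a calibrated probability in [0, 1] for a feature vector.
    score: Callable[[Sequence[float]], float]
    # Assumption: validation precision is the ranking criterion.
    validation_precision: float
    # Assumption: "sufficient evidence" means probability >= threshold.
    threshold: float


def rank_classifiers(classifiers: List[GroupClassifier]) -> List[GroupClassifier]:
    """Order classifiers from best to worst validation performance."""
    return sorted(classifiers, key=lambda c: c.validation_precision, reverse=True)


def attribute(sample: Sequence[float],
              ranked: List[GroupClassifier]) -> Optional[str]:
    """Apply classifiers in rank order; return the first confident group.

    Returns None when no classifier clears its threshold, i.e. the system
    abstains instead of forcing the sample onto a known group.
    """
    for clf in ranked:
        if clf.score(sample) >= clf.threshold:
            return clf.group
    return None
```

Because the loop only sees a ranked list, it does not matter how many classifiers exist per group. The key design point is the explicit None return: a closed-set classifier has no equivalent outcome and must always name a known group.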
We evaluated our approach on the APT Malware dataset and on a larger, combined dataset constructed specifically to stress-test out-of-scope behavior. On the APT Malware dataset, our method achieves higher precision than previously published results on the same data. In the most rigorous test scenario, where 87% of the test samples originated from 60 APT groups excluded from training, the method abstained on 94% of these out-of-scope samples. For the samples it did classify, the approach maintained a precision of 92% and a selective accuracy of 95%.
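To make the three reported quantities concrete, the sketch below computes them for a list of predictions in which None marks an abstention. The abstract does not define the metrics, so the definitions here are assumptions: precision is taken over all attributed samples (so an out-of-scope sample that gets attributed counts as an error), and selective accuracy is taken over attributed samples whose true group was seen in training. The function name selective_metrics is hypothetical.

```python
from typing import Dict, Optional, Sequence, Set


def selective_metrics(y_true: Sequence[str],
                      y_pred: Sequence[Optional[str]],
                      known_groups: Set[str]) -> Dict[str, float]:
    """Compute abstention and correctness metrics under assumed definitions."""
    pairs = list(zip(y_true, y_pred))
    # Samples whose true group was never seen during training.
    out_of_scope = [(t, p) for t, p in pairs if t not in known_groups]
    # Samples on which the system made an attribution (did not abstain).
    attributed = [(t, p) for t, p in pairs if p is not None]
    # Attributed samples that truly belong to a known group.
    in_scope_attributed = [(t, p) for t, p in attributed if t in known_groups]

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "oos_abstention_rate": ratio(
            sum(p is None for _, p in out_of_scope), len(out_of_scope)),
        "precision": ratio(
            sum(t == p for t, p in attributed), len(attributed)),
        "selective_accuracy": ratio(
            sum(t == p for t, p in in_scope_attributed),
            len(in_scope_attributed)),
    }
```

Under these assumed definitions, selective accuracy can exceed precision, since out-of-scope samples that slip through lower precision but are excluded from selective accuracy.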
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



