Phantom Transfer: Data Poisoning can Survive Data-Level Defences
Title: Phantom Transfer: Data Poisoning Can Evade Data-Level Defences
Abstract:
This study introduces "Phantom Transfer," a novel data poisoning attack with a critical property: the poison cannot be detected or removed even when the exact method and location of its injection into an otherwise clean dataset are fully known. By adapting subliminal learning techniques to practical, real-world scenarios, we demonstrate that the attack holds up across variation in the origin of the data, the specific model trained on that data, and the ultimate objective of the attack.
Our results show that Phantom Transfer successfully bypasses 11 distinct data-level defensive measures, including a rigorous protocol where every data sample is paraphrased by a separate model. We analyze the conditions under which this attack is most effective and illustrate its potential to embed password-triggered behaviors into models while still evading detection. Ultimately, this work serves as an existence proof that maximum-aff defence strategies may be insufficient against advanced data poisoning. We recommend that future security frameworks combine these approaches with white-box methodologies and comprehensive post-training model audits.
Source: arXiv
Generated at: 2026-06-03 00:00:00 UTC



