FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences
Title: FLIPS: Leveraging Pseudo-random Sequences for LLM Instance Fingerprinting
Abstract:
Existing research indicates that a Large Language Model’s (LLM) performance is influenced not only by its core weights but also by instance-specific parameters, including quantization settings, sampling configurations, and instructional prompts. Consequently, an LLM that yields safe responses under one configuration may generate harmful or toxic content under another. Current LLM identification methods, such as fingerprinting, are primarily designed for intellectual property protection and therefore prioritize robustness against variations in these instance-level parameters. This design presents a significant hurdle for AI regulation, because regulatory compliance assessments must evaluate the actual behavior of deployed models rather than merely trace model provenance.
To address this gap, this paper proposes a regulator-focused paradigm, instance-level fingerprinting, which distinguishes between different configurations of the same LLM. We introduce FLIPS, a method that exploits the biases present in the binary random sequences that a model instance generates. Tested across 237 model instances, FLIPS achieved an identification accuracy of 96% in closed-set scenarios and 90% in open-set scenarios (where certain targets are unknown). In contrast, the adapted LLMmap baseline reached only 35% accuracy. These results demonstrate that instance-level fingerprinting is not only essential for regulatory purposes but also practically viable. The source code is available at https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting.
Source: arXiv Generated at: 2026-06-03 00:00:00 UTC



