Benchmarking Large Language Models for Cryptanalysis and Side-Channel Vulnerabilities
Title: Evaluating Large Language Models for Cryptanalysis and Side-Channel Weaknesses
Abstract:
Recent progress in large language models (LLMs) has transformed natural language understanding and generation and has prompted extensive evaluation across many applications, yet their capability in cryptanalysis remains largely unexamined. Cryptanalysis is crucial for data security, and it also offers insight into how well LLMs generalize. To address this gap, we assess the cryptanalytic performance of leading LLMs on ciphertexts produced by a variety of cryptographic algorithms. We present a novel benchmark dataset of plaintexts that vary in domain, length, style, and topic, each paired with its encrypted form. Using zero-shot and few-shot prompting, as well as chain-of-thought prompting, we measure the models' decryption success rates and analyze their underlying comprehension abilities. The results reveal the strengths and weaknesses of LLMs in side-channel contexts and raise concerns about their vulnerability to attacks stemming from under-generalization. This work underscores the dual-use implications of LLMs in security settings and contributes to the broader discussion of AI safety and security.
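The evaluation loop described in the abstract (encrypt a plaintext, prompt a model in a zero-shot or few-shot setting, score the decryption) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Caesar cipher stands in for the unspecified "variety of cryptographic algorithms", the prompt wording is invented, exact string match is only one possible definition of decryption success, and the function names `caesar_encrypt`, `build_prompt`, and `exact_match_rate` are hypothetical. No model is called; predictions are passed in as plain strings.

```python
from typing import Sequence, Tuple


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions; other characters pass through.

    Assumption: a Caesar cipher is used only as a simple stand-in cipher.
    """
    out = []
    for ch in plaintext:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def build_prompt(ciphertext: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a zero-shot prompt (no examples) or a few-shot prompt (with examples).

    Each example is a (ciphertext, plaintext) pair. The wording is hypothetical.
    """
    parts = ["Decrypt the ciphertext and reply with the plaintext only."]
    for cipher, plain in examples:
        parts.append(f"Ciphertext: {cipher}\nPlaintext: {plain}")
    parts.append(f"Ciphertext: {ciphertext}\nPlaintext:")
    return "\n\n".join(parts)


def exact_match_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to the reference plaintext after trimming whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)
```

In use, each benchmark plaintext would be encrypted, wrapped with `build_prompt`, sent to the model under test, and the returned strings scored with `exact_match_rate`. A softer metric, such as a character-level similarity score, may be more informative when models recover most but not all of a plaintext.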
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





