arXiv

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning

June 4, 2026 · Vaibhav Singh, Soumya Suvra Ghosal, Kapu Nirmal Joshua, Soumyabrata Pal, Sayak Ray Chowdhury

Abstract:

In-context learning (ICL) has established itself as a robust framework for tailoring large language models (LLMs) to novel and data-limited tasks by leveraging a small set of handpicked, task-specific examples within the prompt. Nevertheless, constrained by the limited context window of LLMs, a critical challenge persists: identifying which examples to choose to optimize performance for a particular user query. Although nearest-neighbor techniques such as KATE are commonly employed for this selection, they exhibit significant limitations in high-dimensional embedding spaces, notably struggling with poor generalization and insufficient diversity. Addressing this example selection dilemma, our research adopts a rigorous, information-theoretic approach. We conceptualize an LLM as a linear function acting on input embeddings and recast the selection process as a query-specific optimization task: choosing a subset of exemplars from a broader pool to minimize prediction error for a given query. This strategy diverges from conventional learning-theoretic methods that prioritize generalization, focusing instead on precise prediction for individual query instances. We develop a principled surrogate objective that is approximately submodular, allowing for the application of a greedy algorithm that offers an approximation guarantee. Our methodology is further refined through two key enhancements: (i) the integration of the kernel trick to facilitate operations in high-dimensional feature spaces without the need for explicit mappings, and (ii) the inclusion of a regularizer based on optimal design to foster diversity among the chosen examples. Our empirical results reveal substantial gains over conventional retrieval techniques across various classification benchmarks, underscoring the advantages of employing structure-aware and diverse example selection for ICL in practical, label-scarce environments.
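
To make the selection procedure described above concrete, the following is a minimal sketch of greedy, query-specific exemplar selection with a kernel and a diversity term. It is not the paper's objective: the surrogate used here (reduction in kernel ridge predictive variance at the query, plus a log-determinant term in the spirit of D-optimal design) is an assumption chosen for illustration. The function names `rbf_kernel`, `score`, and `greedy_select`, the RBF kernel choice, and the parameters `lam`, `div_weight`, and `gamma` are likewise hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def score(K_pool, k_query, idx, lam, div_weight):
    """Assumed surrogate: variance reduction at the query plus a log-det diversity term."""
    K_S = K_pool[np.ix_(idx, idx)] + lam * np.eye(len(idx))
    k_S = k_query[idx]
    # How much the chosen exemplars reduce predictive variance at the query.
    reduction = k_S @ np.linalg.solve(K_S, k_S)
    # log det(I + K_S / lam): larger when the chosen exemplars are less redundant.
    _, logdet = np.linalg.slogdet(K_S / lam)
    return reduction + div_weight * logdet


def greedy_select(pool, query, k, lam=0.1, div_weight=0.1, gamma=1.0):
    """Greedily pick k pool indices, adding the exemplar with the best score each round."""
    K_pool = rbf_kernel(pool, pool, gamma)
    k_query = rbf_kernel(pool, query[None, :], gamma)[:, 0]
    selected = []
    for _ in range(k):
        remaining = [i for i in range(len(pool)) if i not in selected]
        best = max(
            remaining,
            key=lambda i: score(K_pool, k_query, selected + [i], lam, div_weight),
        )
        selected.append(best)
    return selected


# Toy usage: 20 random "embeddings" and a query placed close to pool item 7.
rng = np.random.default_rng(0)
pool = rng.normal(size=(20, 5))
query = pool[7] + 0.01 * rng.normal(size=5)
selected = greedy_select(pool, query, k=4)
```

Only kernel evaluations are used, so the feature map is never formed explicitly. In this sketch the first pick coincides with the nearest neighbor under the RBF kernel, because every single-item set has the same diversity term; later picks trade relevance to the query against redundancy with the exemplars already chosen. Setting `div_weight=0` drops the diversity term entirely.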

Source: arXiv · Generated at: 2026-06-04 00:00:00 UTC