Interpreto: An Explainability Library for Transformers
Title: Interpreto: A Tool for Explaining Transformer Models
Abstract:
Interpreto is an open-source Python package designed to interpret HuggingFace language models, ranging from early BERT iterations to large language models (LLMs). The library offers two complementary method families: concept-based explanations and attribution methods. By providing a unified API for both text generation and classification tasks, Interpreto connects recent academic research with practical application. A standout feature is its comprehensive, end-to-end concept-based pipeline, which handles everything from activation extraction and concept learning to interpretation and scoring. This approach extends beyond simple feature-level attributions and distinguishes the library from many existing alternatives. For more information, visit the GitHub repository at https://github.com/FOR-sight-ai/interpreto and explore the demo site at https://for-sight-ai.github.io/interpreto-demo/.
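To make the attribution family concrete, here is a minimal sketch of occlusion-based attribution, one common perturbation technique. It does not use Interpreto's API: `toy_score`, `TOY_WEIGHTS`, and `occlusion_attribution` are hypothetical names introduced purely for illustration, and the bag-of-words scorer stands in for a real HuggingFace classifier.

```python
from typing import Callable, List, Tuple

# Hypothetical toy "model": a bag-of-words sentiment scorer standing in for
# a real classifier. Nothing here comes from Interpreto itself.
TOY_WEIGHTS = {"great": 2.0, "good": 1.0, "bad": -1.5, "not": -0.5}


def toy_score(tokens: List[str]) -> float:
    """Return a sentiment score: the sum of per-word weights."""
    return sum(TOY_WEIGHTS.get(tok.lower(), 0.0) for tok in tokens)


def occlusion_attribution(
    tokens: List[str], score_fn: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Attribute the score to each token as the score drop when it is removed."""
    baseline = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        # Remove token i and re-score the remaining tokens.
        occluded = tokens[:i] + tokens[i + 1:]
        attributions.append((tok, baseline - score_fn(occluded)))
    return attributions


if __name__ == "__main__":
    sentence = "The movie was great but the ending was bad".split()
    for token, value in occlusion_attribution(sentence, toy_score):
        print(f"{token:>8s} {value:+.2f}")
```

A positive value means the token pushed the score up, a negative value means it pushed the score down. Concept-based methods work at a different level: rather than scoring input tokens, they learn directions in the model's activation space and interpret those.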
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC