From Global to Local: Learning Context-Aware Graph Representations for Document Classification and Summarization
Title: Bridging the Global and Local: Context-Aware Graph Representations for Document Summarization and Classification
Abstract:
Current natural language processing (NLP) models typically encode documents as linear sequences of tokens. While this preserves sequential order, it often struggles to model long-range dependencies and overall document structure, a limitation that is especially pronounced in long texts. To address this, we introduce a data-driven technique for automatically constructing graph-based document representations. Building on the work of Bugueño and de Melo (2025), our method uses a dynamic sliding-window attention mechanism to capture local and mid-range semantic connections between sentences, as well as structural relationships within documents. Experiments show that Graph Attention Networks (GATs) trained on these learned graphs achieve competitive performance on document classification while consuming fewer computational resources than prior methods. We also provide an exploratory assessment of this graph construction approach for extractive document summarization, outlining both its potential and its current limitations. The code for this project is publicly available on GitHub.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





