arXiv

DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset

June 2, 2026 · Shannan Liu, Peifeng Li, Yaxin Fan, Qiaoming Zhu

Abstract:

Multi-party dialogue discourse parsing aims to uncover the dependency structure among the utterances of a conversation and the relation type of each dependency. Existing research has focused predominantly on text-only or two-party interactions and therefore falls short of addressing the complexities of multimodal, multi-party settings. To bridge this gap, we introduce DraDDP, the first publicly available English dataset for multimodal multi-party dialogue discourse parsing, derived from American television dramas. It comprises 495 dialogue segments totaling 6,374 utterances and 9.1 hours of synchronized video, covering a wide range of multi-party interaction scenarios. We also establish benchmarks for the task on DraDDP and analyze in detail how the individual modalities affect performance. Our findings show that multimodal information contributes significantly to identifying both dialogue structures and relation types. To support further work on multimodal dialogue understanding, we intend to release the dataset, annotation guidelines, and associated code publicly.
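To make the task definition concrete, the following minimal Python sketch shows one way a parsed dialogue could be represented: utterances as nodes, and labeled dependency links between them. The class names, field names, example utterances, and relation labels are all illustrative assumptions and are not taken from the DraDDP release. The check that every link runs from an earlier utterance to a later one is likewise an assumed convention. Only the three corpus totals at the end come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    index: int    # position of the utterance within its dialogue
    speaker: str
    text: str


@dataclass(frozen=True)
class DiscourseLink:
    head: int       # index of the earlier utterance being attached to
    dependent: int  # index of the later utterance that attaches to it
    relation: str   # relation type (placeholder labels, not DraDDP's inventory)


def is_well_formed(utterances, links):
    """Return True if every link joins two known utterances and points
    from an earlier utterance to a later one (an assumed convention)."""
    valid = {u.index for u in utterances}
    return all(
        link.head in valid
        and link.dependent in valid
        and link.head < link.dependent
        for link in links
    )


# A made-up three-party exchange, for illustration only.
utterances = [
    Utterance(0, "A", "Did anyone see my keys?"),
    Utterance(1, "B", "They are on the counter."),
    Utterance(2, "C", "You always lose them."),
]
links = [
    DiscourseLink(head=0, dependent=1, relation="Question-answer pair"),
    DiscourseLink(head=0, dependent=2, relation="Comment"),
]
print(is_well_formed(utterances, links))

# Corpus totals reported in the abstract.
N_DIALOGUES = 495
N_UTTERANCES = 6374
VIDEO_HOURS = 9.1

# Derived averages per dialogue segment.
avg_utterances = N_UTTERANCES / N_DIALOGUES
avg_seconds = VIDEO_HOURS * 3600 / N_DIALOGUES
print(f"{avg_utterances:.2f} utterances, {avg_seconds:.1f} s of video per dialogue")
```

In this sketch, utterances 1 and 2 both attach to utterance 0, which is the kind of branching that multi-party conversation produces and that a two-party, strictly alternating exchange rarely shows. The derived averages, roughly 13 utterances and about 66 seconds of video per dialogue segment, follow directly from the reported totals.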
