Explicit, Implicit, and Scattered: Revisiting Event Extraction to Capture Complex Arguments

Abstract

Prior work formulates the extraction of event-specific arguments as a span extraction problem, in which event arguments are explicit, i.e., assumed to be contiguous spans of text in a document. In this study, we revisit this definition of Event Extraction (EE) by introducing two key argument types that cannot be modeled by existing EE frameworks. First, implicit arguments are event arguments that are not explicitly mentioned in the text but can be inferred from context. Second, scattered arguments are event arguments composed of information scattered throughout the text. These two argument types are crucial for eliciting the full breadth of information required for proper event modeling.
To support the extraction of explicit, implicit, and scattered arguments, we develop a novel dataset, DiscourseEE, which includes 7,464 argument annotations from online health discourse. Notably, 51.2% of the arguments are implicit, and 17.4% are scattered, making DiscourseEE a unique corpus for complex event extraction. Additionally, we formulate argument extraction as a text generation problem to facilitate the extraction of complex argument types. We provide a comprehensive evaluation of state-of-the-art models and highlight critical open challenges in generative event extraction.
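
To make the three argument types concrete, the following is a minimal sketch of a surface-level heuristic that sorts an argument string into explicit, implicit, or scattered relative to a document. It is not the DiscourseEE annotation procedure: the function names, the token-overlap rule, and the example post are all assumptions made for illustration. In particular, real implicit arguments require inference from context, which a string comparison cannot capture.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify_argument(document, argument):
    """Heuristically label an argument relative to a document.

    Assumed rules (illustrative only, not the dataset's guidelines):
      explicit  - the argument tokens occur as one contiguous run
      scattered - every argument token occurs, but not contiguously
      implicit  - at least one argument token never occurs in the text
    """
    doc_tokens = tokenize(document)
    arg_tokens = tokenize(argument)
    if not arg_tokens:
        raise ValueError("argument must contain at least one token")

    # Explicit: look for the argument as a contiguous token window.
    n = len(arg_tokens)
    for i in range(len(doc_tokens) - n + 1):
        if doc_tokens[i:i + n] == arg_tokens:
            return "explicit"

    # Scattered: all pieces are present somewhere in the document.
    if set(arg_tokens) <= set(doc_tokens):
        return "scattered"

    # Implicit: the argument is not recoverable from surface tokens.
    return "implicit"


# Hypothetical post, written for this example (not taken from the dataset).
post = (
    "I started taking ibuprofen last week. My headache went away "
    "but the nausea came back after the second dose."
)

labels = {
    "ibuprofen": classify_argument(post, "ibuprofen"),
    "headache nausea": classify_argument(post, "headache nausea"),
    "adult patient": classify_argument(post, "adult patient"),
}
```

Because implicit and scattered arguments need not match any single span of the text, a span-based extractor cannot return them, which is the motivation for treating argument extraction as text generation.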

Event Ontology and Annotation

Event ontology of the DiscourseEE dataset.

Example annotation in DiscourseEE.

Our DiscourseEE development pipeline.

DiscourseEE Statistics

Distribution of type-specific and subject-effect arguments in the dataset

DiscourseEE statistics across the three event types.

Explicit, implicit, and scattered arguments distribution in DiscourseEE.

Results for Event Detection and Argument Extraction Tasks

Performance comparison (avg. of 3 runs) of the models for event detection.

Performance (avg. of 3 runs) of the models for event argument extraction across all argument types in relaxed match F1-score.

Performance (avg. of 3 runs) of the models for event argument extraction across all argument types in exact match F1-score.

Performance comparison of explicit, implicit, and scattered argument extraction in relaxed match settings.

Performance comparison of explicit, implicit, and scattered argument extraction in exact match settings.
