DReD–A Descriptive Relation Dataset for Expanding Relation Extraction

Impact Statement:
Relation extraction is a powerful task that extracts labeled connections between words in a document. Existing datasets focus on relations between important named nouns, with relations drawn from a list of predefined categories. These categories limit trained models, which miss important context that a category name alone cannot capture. Our new Descriptive Relation Dataset, DReD, overcomes these limitations by allowing models to learn how to describe relations in a sentence. DReD contains 3286 annotations of descriptions of relations between general noun phrases, which removes these limitations and offers a way to uncover previously unseen relation types while providing meaningful context. Furthermore, any sequence-to-sequence model can be easily trained on DReD, allowing for flexible and future-proof applications.

Abstract:

Relation extraction is a fundamental topic in document information extraction. Traditionally, datasets for relation extraction have been annotated with named entities and classified with a subset of relation categories. Models then predict either the entities and relations (end-to-end) or assume the entities are given and only classify the relations. However, current approaches are limited by datasets with a narrow definition of entities and relations. We seek to remedy this by introducing our Descriptive Relation Dataset (DReD), which contains 3286 annotations for descriptions of relations between more general noun phrases inspired by linguistic theory. We benchmark our dataset using several seq2seq models and find that T5 achieves the best results with a ROUGE-1 score of 75.5. We verify the usefulness of DReD by collecting feedback on 100 predictions and comparing human judgment to automated scoring methods. Finally, we verify that relations can be described accurately by transformin...
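
For readers unfamiliar with the ROUGE-1 metric mentioned above, the following is a minimal sketch of unigram-overlap scoring. It assumes lowercase whitespace tokenization with no stemming; the abstract does not state the exact ROUGE configuration used, so this is not necessarily how the reported score was computed. The function name `rouge1` and the example sentences are hypothetical and are not taken from DReD.

```python
from collections import Counter


def rouge1(reference: str, prediction: str) -> dict:
    """Unigram-overlap precision, recall and F1 (a simplified ROUGE-1).

    Assumption: tokens are lowercase whitespace-separated words.
    """
    ref_tokens = reference.lower().split()
    pred_tokens = prediction.lower().split()

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the reference and the prediction.
    overlap = sum((Counter(ref_tokens) & Counter(pred_tokens)).values())

    precision = overlap / len(pred_tokens) if pred_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical relation descriptions, for illustration only.
scores = rouge1(
    reference="the drug reduces the risk of stroke",
    prediction="the drug lowers stroke risk",
)
print(scores)
```

Scores from this sketch lie in the range 0 to 1; published results are often reported on a 0 to 100 scale.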
Published in: IEEE Transactions on Artificial Intelligence (Volume: 4, Issue: 6, December 2023)
Page(s): 1494 - 1503
Date of Publication: 12 September 2022
Electronic ISSN: 2691-4581
