Impact Statement:
Relation extraction is a powerful task, providing a method to extract labeled connections between words in a document. Existing datasets focus on relations between important named entities, with relations drawn from a list of predefined categories. These fixed categories limit trained models, which miss important context that a category name alone cannot capture. Our new Descriptive Relation Dataset, DReD, overcomes these limitations by providing a dataset that teaches models to describe relations in a sentence. DReD contains 3286 annotations of descriptions of relations between general noun phrases, removing these limitations and providing a way to uncover previously unseen relation types while preserving meaningful context. Furthermore, any sequence-to-sequence model can be easily trained on DReD, allowing for flexible and future-proof applications.
Abstract:
Relation extraction is a fundamental topic in document information extraction. Traditionally, datasets for relation extraction have been annotated with named entities and classified with a subset of relation categories. Models then predict either the entities and relations (end-to-end) or assume the entities are given and only classify the relations. However, current approaches are limited by datasets with a narrow definition of entities and relations. We seek to remedy this by introducing our Descriptive Relation Dataset (DReD), which contains 3286 annotations of descriptions of relations between more general noun phrases inspired by linguistic theory. We benchmark our dataset using several seq2seq models and find that T5 achieves the best results with a ROUGE-1 score of 75.5. We verify the usefulness of DReD by collecting feedback on 100 predictions and comparing human judgment to automated scoring methods. Finally, we verify that relations can be described accurately by transformin...
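The abstract reports ROUGE-1 as the automated scoring method for generated relation descriptions. The following is a minimal Python sketch of what that metric measures, assuming ROUGE-1 is computed as unigram-overlap F1 between a model's output and the reference annotation; the example strings are hypothetical and not drawn from DReD, and a standard ROUGE implementation would be used in practice.

from collections import Counter

def rouge_1_f1(prediction: str, reference: str) -> float:
    """Compute ROUGE-1 F1 (unigram overlap) between two strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Overlap counts each shared unigram at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical generated description vs. reference description for a noun-phrase pair.
print(rouge_1_f1(
    "the committee approved the new budget",
    "the committee voted to approve the budget",
))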
Published in: IEEE Transactions on Artificial Intelligence (Volume: 4, Issue: 6, December 2023)