Information Extraction deals with the automatic extraction of information from unstructured sources. This field has opened up new avenues for querying, organizing, and analyzing data by drawing upon the clean semantics of structured databases and the abundance of unstructured data. The text surveys over two decades of information extraction research from various communities such as computational linguistics, machine learning, databases and information retrieval. Information Extraction provides a taxonomy of the field along various dimensions derived from the nature of the extraction task, the techniques used for extraction, the variety of input resources exploited, and the type of output produced. It elaborates on rule-based and statistical methods for entity and relationship extraction. In each case it highlights the different kinds of models for capturing the diversity of clues driving the recognition process and the algorithms for training and efficiently deploying the models. It surveys techniques for optimizing the various steps in an information extraction pipeline, adapting to dynamic data, integrating with existing entities and handling uncertainty in the extraction process. Information Extraction is an ideal reference for anyone with an interest in the fundamental concepts of this technology. It is also an invaluable resource for those researching, designing or deploying models for extraction.