Abstract:
Structured data typically follows a predefined format. Semi-structured and unstructured data, such as documents, emails, social media posts, images, and videos, do not. In this research, a novel process is presented to extract structured data from emails about a domain such as a project or product. The process consists of three phases: data cleaning, data extraction, and data consolidation. Data cleaning validates the format of each email. Data extraction consists of keyword extraction, sentiment analysis, regular-expression matching, entity extraction, and summary extraction. Data consolidation combines the extracted data to obtain structured data from the emails, which makes the extracted knowledge easier to manage and analyze. In large industries, it is advisable to consolidate all the emails regarding a project or product into one document using this process for later use. This solution will facilitate better decision-making.
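The three phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`clean`, `extract`, `consolidate`, `process`), the stopword list, the required headers, and the ISO date pattern are all assumptions made for illustration. Only keyword extraction and regular-expression matching are shown; sentiment analysis, entity extraction, and summary extraction are omitted because they would normally rely on an external NLP library.

```python
import re
from collections import Counter
from email import message_from_string

# Hypothetical stopword list and date pattern, chosen only for this sketch.
STOPWORDS = {"the", "and", "for", "this", "that", "will", "with", "are", "was"}
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def clean(raw):
    """Phase 1 (data cleaning): accept an email only if its format is valid.

    "Valid" is assumed here to mean: single-part, with From and Subject
    headers and a non-empty body. Returns None for rejected emails.
    """
    msg = message_from_string(raw)
    if msg.is_multipart():
        return None
    body = msg.get_payload()
    if not (msg["From"] and msg["Subject"] and body.strip()):
        return None
    return {"sender": msg["From"], "subject": msg["Subject"], "body": body.strip()}


def extract(record, top_n=3):
    """Phase 2 (data extraction): frequency-based keywords plus regex-matched dates."""
    words = re.findall(r"[a-z]+", record["body"].lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return {
        "sender": record["sender"],
        "keywords": [w for w, _ in counts.most_common(top_n)],
        "dates": DATE_RE.findall(record["body"]),
    }


def consolidate(extracted):
    """Phase 3 (data consolidation): merge per-email results into one record."""
    keywords = Counter()
    senders, dates = set(), set()
    for item in extracted:
        keywords.update(item["keywords"])
        senders.add(item["sender"])
        dates.update(item["dates"])
    return {
        "emails": len(extracted),
        "senders": sorted(senders),
        "keywords": dict(keywords),
        "dates": sorted(dates),
    }


def process(raw_emails):
    """Run all three phases over a list of raw RFC 822 style email strings."""
    cleaned = [r for r in map(clean, raw_emails) if r is not None]
    return consolidate([extract(r) for r in cleaned])
```

Calling `process` on a list of raw email strings about one project returns a single dictionary, which corresponds to the "one document" per project or product that the abstract recommends keeping.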
Published in: 2017 International Conference on Networks & Advances in Computational Technologies (NetACT)
Date of Conference: 20-22 July 2017
Date Added to IEEE Xplore: 23 October 2017