Enhancing Recall Using Data Cleaning for Biomedical Big Data | IEEE Conference Publication | IEEE Xplore


Abstract:

In clinical practice, large amounts of heterogeneous medical data are generated daily. These data can support biomedical research and serve as a diagnostic reference for physicians. However, heterogeneous data must be integrated before they can be analyzed. The integration process includes a pre-processing data cleaning phase that eliminates the inconsistencies and errors originating from each data source. In this paper, we describe a workflow for cleaning heterogeneous biomedical data sources. Our novel data cleaning approach can be applied to replace missing text and to increase the number of relevant cases retrieved by search queries. When the threshold for missing category replacement is met, our results show that our method achieves a missing content replacement precision of 85%, which represents an improvement of 18% over the baseline state of our datasets.
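
The abstract does not specify how the replacement threshold or the precision figure is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a missing category is filled with the dominant category among records that share a key, and only when that category's share meets a threshold; precision is then the fraction of filled values that match a ground truth. The function names, the `desc` and `category` fields, the `imputed` flag, and the 0.8 threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def fill_missing_categories(records, key, field, threshold=0.8):
    """Fill empty `field` values with the dominant value among records sharing `key`.

    Hypothetical illustration: a value is filled only if the dominant
    category accounts for at least `threshold` of the known values.
    """
    # Count the known categories for each key value.
    counts = defaultdict(Counter)
    for rec in records:
        if rec.get(field):
            counts[rec[key]][rec[field]] += 1

    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are not mutated
        if not rec.get(field):
            seen = counts.get(rec[key])
            if seen:
                value, n = seen.most_common(1)[0]
                if n / sum(seen.values()) >= threshold:
                    rec[field] = value
                    rec["imputed"] = True  # mark the value as a replacement
        filled.append(rec)
    return filled


def replacement_precision(filled, truth, field):
    """Return the fraction of imputed values that match `truth`, or None if none were imputed."""
    imputed = [(rec, t) for rec, t in zip(filled, truth) if rec.get("imputed")]
    if not imputed:
        return None
    correct = sum(1 for rec, t in imputed if rec[field] == t)
    return correct / len(imputed)


# Toy records, invented for illustration only.
records = [
    {"desc": "chest xray", "category": "radiology"},
    {"desc": "chest xray", "category": "radiology"},
    {"desc": "chest xray", "category": ""},
    {"desc": "biopsy", "category": "pathology"},
    {"desc": "biopsy", "category": "surgery"},
    {"desc": "biopsy", "category": None},
]
truth = ["radiology", "radiology", "radiology", "pathology", "surgery", "pathology"]

filled = fill_missing_categories(records, "desc", "category", threshold=0.8)
precision = replacement_precision(filled, truth, "category")
```

In this toy data the third record is filled because every known "chest xray" record agrees, while the last record is left empty because neither "biopsy" category reaches the threshold. Raising the threshold trades fewer replacements for a lower risk of wrong ones.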
Date of Conference: 28-30 July 2020
Date Added to IEEE Xplore: 01 September 2020
Conference Location: Rochester, MN, USA
