An NLP-Inspired Data Augmentation Method for Adverse Event Prediction Using an Imbalanced Healthcare Dataset

The proposed record augmentation method creates a synthetic patient record by replacing a piece of background information in the original patient record with a similar one.

Abstract:

This paper proposes a data augmentation method for imbalanced healthcare datasets. This method was inspired by a data augmentation method in natural language processing (NLP) that generates synthetic sentences for training by replacing some words with similar words. The proposed method generates synthetic patient records by replacing patient backgrounds with similar backgrounds. In this paper, the cosine similarity of the distributed representations was used as the similarity metric between patient backgrounds. The distributed representations of the patient backgrounds were generated by the skip-gram model. To confirm the performance improvement with the proposed data augmentation method, the prediction performance of adverse events (AEs) caused by drug administration was experimentally evaluated on a real-world medical dataset with 1,510,137 records. The combination of the proposed data augmentation method and a conventional undersampling method resulted in an 80.0% improvement in accuracy and a 40.0% improvement in the precision and F1-score. The multifaceted evaluation demonstrated that the proposed method is effective, especially for predicting AEs with positive ratios ranging from 1.0% to 2.1%, which are difficult to predict with conventional machine learning methods but should be predictable in the medical field.
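The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The vocabulary, the three-dimensional vectors, and the function names below are hypothetical stand-ins. In the paper the vectors come from a skip-gram model trained on patient backgrounds. The sketch also assumes that the single most similar background is always chosen, which the abstract does not specify.

```python
import numpy as np

# Hypothetical toy vocabulary of patient-background items with hand-made vectors.
# In the paper these vectors are produced by a skip-gram model; the values here
# are illustrative only.
VOCAB = ["hypertension", "high_blood_pressure", "diabetes", "asthma"]
EMB = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.0, 0.1, 1.0],
])


def cosine_similarity_matrix(emb):
    """Return the pairwise cosine similarity between all rows of emb."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)  # guard against zero-length vectors
    return unit @ unit.T


def most_similar(item, vocab, sim):
    """Return the vocabulary item most similar to `item`, excluding itself."""
    i = vocab.index(item)
    row = sim[i].copy()
    row[i] = -np.inf  # never pick the item as its own replacement
    return vocab[int(np.argmax(row))]


def augment_record(record, position, vocab, sim):
    """Copy `record` and replace the background at `position` with a similar one."""
    synthetic = list(record)
    synthetic[position] = most_similar(record[position], vocab, sim)
    return synthetic


sim = cosine_similarity_matrix(EMB)
original = ["diabetes", "hypertension"]
synthetic = augment_record(original, 1, VOCAB, sim)
print(original, "->", synthetic)
```

In an imbalanced setting, synthetic records of this kind would be generated for the minority (positive) class, and the abstract reports the best results when this is combined with conventional undersampling of the majority class.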

Published in: IEEE Access (Volume: 10)

Page(s): 81166 - 81176

Date of Publication: 01 August 2022

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2022.3195212
