Cascade word embedding to sentence embedding: A class label enhanced approach to phenotype extraction | IEEE Conference Publication | IEEE Xplore

Abstract:

In molecular biology, phenotypes are often described using complex semantics and diverse biomedical expressions, which motivates the development of named entity recognition (NER) methods. Here, we propose a novel approach to recognizing plant phenotypes by cascading word embedding to sentence embedding with class label enhancement. We used a word embedding method to find high-frequency phenotypes, with the original sentences then used as input to a sentence embedding method. Using this cascaded approach, we identified author-specific phenotypic expressions. In addition, we integrated a negative class label enhanced (NCLE) algorithm into our method to further optimize the Sen2Vec training model. We tested the effectiveness of our approach on 56,748 PubMed abstracts for the model organism Arabidopsis thaliana, which resulted in a 135% increase in the number of new phenotypic descriptions compared with the original phenotype ontology.
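
The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word vectors, the example sentences, the thresholds, and every function name below are hypothetical, and sentence vectors are formed by simple averaging as a stand-in for Sen2Vec. The NCLE step is not modeled. Stage 1 keeps frequent words that lie close to a seed phenotype term in word-vector space; stage 2 embeds the sentences containing those words and ranks them against a query.

```python
import math
import re
from collections import Counter

# Toy 3-d word vectors (hypothetical values standing in for a trained word embedding).
WORD_VECS = {
    "dwarf": (0.9, 0.1, 0.0),
    "stunted": (0.8, 0.2, 0.1),
    "short": (0.85, 0.15, 0.05),
    "growth": (0.6, 0.3, 0.1),
    "late": (0.1, 0.9, 0.0),
    "flowering": (0.2, 0.8, 0.1),
    "protein": (0.0, 0.1, 0.9),
    "binds": (0.1, 0.0, 0.8),
    "kinase": (0.05, 0.1, 0.95),
}

# Hypothetical example sentences, not taken from the PubMed corpus.
SENTENCES = [
    "The mutant shows a dwarf phenotype with stunted growth",
    "Plants were dwarf and displayed late flowering",
    "The kinase protein binds the receptor",
    "Stunted seedlings had short roots",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_terms(sentences, seed, min_count=2, min_sim=0.9):
    """Stage 1: frequent in-vocabulary words that are close to the seed term."""
    counts = Counter(
        tok for s in sentences for tok in tokenize(s) if tok in WORD_VECS
    )
    return {
        word
        for word, count in counts.items()
        if count >= min_count
        and cosine(WORD_VECS[word], WORD_VECS[seed]) >= min_sim
    }


def sentence_vec(sentence):
    """Average the vectors of known words (assumed stand-in for Sen2Vec)."""
    vecs = [WORD_VECS[t] for t in tokenize(sentence) if t in WORD_VECS]
    if not vecs:
        return None
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cascade(sentences, seed, query):
    """Stage 2: rank sentences that contain a stage-1 term against the query."""
    terms = candidate_terms(sentences, seed)
    query_vec = sentence_vec(query)
    if query_vec is None:
        return []
    scored = []
    for s in sentences:
        if terms & set(tokenize(s)):
            scored.append((cosine(sentence_vec(s), query_vec), s))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, sentence in cascade(SENTENCES, "dwarf", "dwarf stunted growth"):
        print(f"{score:.3f}  {sentence}")
```

In this sketch the kinase sentence never reaches stage 2 because it contains no stage-1 term, which mirrors the idea of letting word-level evidence select the sentences passed on to the sentence-level model.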
Date of Conference: 13-16 November 2017
Date Added to IEEE Xplore: 18 December 2017
Conference Location: Kansas City, MO, USA
