
Chinese Multilabel Short Text Classification Method Based on GAN and Pinyin Embedding

Model framework. It is mainly composed of two parts: text representation and classification.

Abstract:

With the development of the Chinese Internet, a large amount of Chinese short text data has been generated. Multilabel classification of Chinese short texts enables more effective management and analysis. However, because Chinese short text features are sparse and commonly used multilabel classification models are primarily designed and developed for English, traditional sampling methods can easily lead to poor classification results. In response to these challenges, we propose a Chinese multilabel short text classification method based on GAN and enhanced with pinyin. First, we use BERT augmented with pinyin embedding as the text vector representation, which enriches the text information. Second, multiple hidden layers of BERT are integrated with the generators of the GAN model to comprehensively learn the feature distribution. Finally, an improved sampling method is used to help the model learn better. Experimental results show that the proposed method performs better on Chinese multilabel short text classification tasks.


Published in: IEEE Access (Volume: 12)

Page(s): 83323 - 83329

Date of Publication: 11 June 2024

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2024.3412649
