Topic-Noise Models: Modeling Topic and Noise Distributions in Social Media Post Collections


Abstract:

Most topic models define a document as a mixture of topics and each topic as a mixture of words. Generally, the difference in generative topic models is how these mixtures of topics are generated. We propose looking at topic models in a new way, as topic-noise models. Our topic-noise model defines a document as a mixture of topics and noise. Topic Noise Discriminator (TND) estimates both the topic and noise distributions using not only the relationships between words in documents, but also the linguistic relationships found using word embeddings. This type of model is important for short, sparse social media posts that contain both random and non-random noise. We also understand that topic quality is subjective and that researchers may have preferences. Therefore, we propose a variant of our model that combines the pre-trained noise distribution from TND in an ensemble with any generative topic model to filter noise words and produce more coherent and diverse topic sets. We present this approach using Latent Dirichlet Allocation (LDA) and show that it is effective for maintaining high quality LDA topics while removing noise within them. Finally, we show the value of using a context-specific noise list generated from TND to remove noise statically, after topics have been generated by any topic model, including non-generative ones. We demonstrate the effectiveness of all three of these approaches that explicitly model context-specific noise in document collections.
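The last approach in the abstract, static removal of noise words after topics have been generated, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_topics`, the `top_k` parameter, and the toy word lists are all hypothetical, and the noise list here is written by hand rather than derived from TND.

```python
def filter_topics(topics, noise_words, top_k=None):
    """Drop noise words from each topic's ranked word list, preserving order.

    topics      -- list of topics, each a list of words ranked by weight
    noise_words -- iterable of words to remove (assumed to come from a
                   context-specific noise list; hand-written in this sketch)
    top_k       -- optional cap on the number of words kept per topic
    """
    noise = {w.lower() for w in noise_words}
    cleaned = []
    for words in topics:
        kept = [w for w in words if w.lower() not in noise]
        cleaned.append(kept[:top_k] if top_k is not None else kept)
    return cleaned


# Toy, made-up inputs for illustration only; not data from the paper.
topics = [
    ["vaccine", "lol", "doses", "rt", "clinic"],
    ["election", "vote", "amp", "ballot", "lol"],
]
noise_words = ["lol", "rt", "amp"]

cleaned = filter_topics(topics, noise_words, top_k=3)
for words in cleaned:
    print(" ".join(words))
```

Because the filter only needs ranked word lists, it can in principle be applied to the output of any topic model, which is the property the abstract highlights for the static variant.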
Date of Conference: 07-10 December 2021
Date Added to IEEE Xplore: 24 January 2022
Conference Location: Auckland, New Zealand
