Abstract:
Dealing with irrelevant emails has become part of every email user's routine. Emails that merely seem valid land in the inbox, while relevant emails are sometimes directed to spam. In addition, the very high volume of incoming email makes it difficult to identify the required messages. Users waste considerable time and effort sifting through emails in which they have no interest, and frequent junk email is a source of frustration. This paper uses features such as n-grams, lemmatization, a personalized vocabulary, and observed patterns. Text-mining algorithms such as n-gram analysis and Lesk are used to find the frequency of positive words, negative words, and the most frequent words. A WordNet library is used to find relevant words in the text. Unigram, bigram, and trigram models are incorporated to learn the writing behavior of users by creating a user vocabulary, which contains particular words and their frequencies in incoming emails. Test emails can be used to evaluate the performance of the system. The system can also serve as a fraud detection system that identifies the real sender.
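The user vocabulary described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`tokenize`, `ngrams`, `build_user_vocabulary`), the regex tokenizer, and the sample emails are all assumptions made for illustration, and lemmatization and the Lesk/WordNet steps are omitted.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_user_vocabulary(emails, max_n=3):
    """Count unigrams, bigrams, and trigrams across a user's emails.

    Returns a dict mapping n (1..max_n) to a Counter of n-gram frequencies.
    """
    vocab = {n: Counter() for n in range(1, max_n + 1)}
    for body in emails:
        tokens = tokenize(body)
        for n in vocab:
            vocab[n].update(ngrams(tokens, n))
    return vocab

# Invented sample emails, used only to exercise the sketch.
emails = [
    "Please review the project report before the meeting.",
    "The project report is attached. Please review it today.",
]
vocab = build_user_vocabulary(emails)
print(vocab[2].most_common(3))  # the three most frequent bigrams
```

Counting n-grams as tuples keeps unigrams, bigrams, and trigrams in the same structure, so a new email could be compared against the vocabulary by checking how many of its n-grams the user has seen before.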
Published in: 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS)
Date of Conference: 01-02 August 2017
Date Added to IEEE Xplore: 21 June 2018