Abstract:
Text classification has strengthened language processing tools, and many approaches have been proposed for the problem of classifying text documents. In this work we focus on written Bengali. Bengali speakers have long been familiar with two written forms of the language, the saint form and the common form. Identifying the form of a text makes document processing, such as translation, easier. We use six well-known supervised classifiers to predict whether a given text is in the saint or the common form. Our dataset consists of more than 1200 mixed sentences collected from written Bengali sources. Each text is preprocessed before classification: sentence splitting, stemming, stop-word removal, and contraction handling. The preprocessed Bengali text is then given as input to the machine learning classifiers, which produced encouraging results on the two forms. The NB classifier performed best, identifying the correct form with about 77% accuracy on Bengali text in the saint and common forms. The other ML classifiers (XGB, RB, DT, SVC, KNN) achieved accuracies ranging from 64% up to nearly 77%, as reported in the different sections of this paper.
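The pipeline described above (preprocess, then classify) can be illustrated with a minimal sketch. It assumes that NB stands for Naive Bayes and uses a small multinomial variant with add-one smoothing written in plain Python. The stop-word list, the four training sentences, and the names `preprocess` and `NaiveBayes` are hypothetical and are not taken from the paper or its dataset; stemming and contraction handling are omitted for brevity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list; the paper's actual list is not given.
STOP_WORDS = {"ও", "এবং"}


def preprocess(sentence):
    """Split on whitespace, strip common punctuation, drop stop words."""
    tokens = [tok.strip("।,?!") for tok in sentence.split()]
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(preprocess(sentence))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        tokens = [t for t in preprocess(sentence) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_sentences = [
    "তাহারা বিদ্যালয়ে যাইতেছে।",  # saint form
    "আমি কাজ করিয়াছি।",           # saint form
    "তারা বিদ্যালয়ে যাচ্ছে।",      # common form
    "আমি কাজ করেছি।",              # common form
]
train_labels = ["saint", "saint", "common", "common"]

model = NaiveBayes().fit(train_sentences, train_labels)
print(model.predict("তাহারা যাইতেছে।"))
print(model.predict("তারা যাচ্ছে।"))
```

Word-level counts are the simplest possible feature choice here. With a dataset of realistic size, stemming or character n-gram features would likely matter, since the two forms often differ in verb endings and pronoun shapes rather than in whole words.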
Published in: 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT)
Date of Conference: 01-03 July 2020
Date Added to IEEE Xplore: 15 October 2020