The framework starts with data collection and data cleaning. Label aggregation was an elaborate task carried out with domain experts. The cleaned data feeds into text...
Abstract:
Citizen science has emerged in many countries to contribute to the prompt resolution of individual field problems and has shifted toward Information Systems (IS) research. In the IS domain, many local governments have introduced citizen report mechanisms to understand regional problems through public participation. The rise of social media has pushed many organizations, including local governments, to make use of information from citizens, including text. Text mining has been applied to various types of analysis, such as sentiment analysis. However, it faces many challenges in a local context. The local context of words can cause various conflation errors that strongly affect learning tasks such as classification. This study proposes a context-based text preprocessing approach and feeds it into a machine learning framework to classify citizen report data. The context-based text preprocessing used statistical and semantic measurements to extract the local context and drew on domain expertise to verify misinterpretations before further text processing such as feature extraction. Subsequently, n-gram language models together with Term Frequency and Inverse Document Frequency schemes were used to build the features. The results showed that the context-based text preprocessing improved classification performance for the majority of classifiers by about 3% with combinations of n-gram features.
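The feature-building step described in the abstract (word n-grams weighted by Term Frequency and Inverse Document Frequency) can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant, n-gram range, or tokenizer the study used, so the plain `tf * log(N / df)` weighting, the unigram-plus-bigram range, the whitespace tokenizer, the function names, and the toy reports below are all assumptions made for illustration only.

```python
import math
from collections import Counter


def word_ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_features(docs, n_min=1, n_max=2):
    """Build one {ngram: weight} dict per document.

    tf  = count of the n-gram in the document / number of n-grams in it
    idf = log(N / df), where df is the number of documents containing it
    (an assumed, unsmoothed variant; the study may have used another).
    """
    # Assumed tokenizer: lowercase, then split on whitespace.
    grams_per_doc = [word_ngrams(d.lower().split(), n_min, n_max) for d in docs]

    # Document frequency: count each n-gram once per document.
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(set(grams))

    n_docs = len(docs)
    features = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        features.append({
            g: (c / total) * math.log(n_docs / doc_freq[g])
            for g, c in counts.items()
        })
    return features


# Hypothetical toy reports, not taken from the study's data.
reports = [
    "street light broken",
    "street flooded after rain",
    "garbage not collected",
]
features = tfidf_features(reports)
```

In this toy corpus, "street" appears in two of the three reports and therefore receives a lower weight in the first report than "light", which appears in only one. The resulting per-document dictionaries could then be passed to any classifier that accepts sparse features.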
Published in: IEEE Access (Volume: 10)