Abstract:
Advances in deep learning have made it possible to automatically generate text that can be mistaken for human writing in the field of natural language processing. As a result, identifying the author of a text is becoming increasingly important. In this paper, we propose a method for evaluating the consistency of sentences that can distinguish between "sentences composed entirely of human-written text" and "sentences containing a mixture of human-written and machine-generated text". We evaluated the proposed method experimentally and confirmed that, on a dataset combining human-written text and mixed text, it discriminates between the two kinds of sentences with higher accuracy than existing work. Furthermore, Kendall's rank correlation coefficient and the Mann-Whitney U test in the sentence discrimination experiment confirmed that the proposed method shows a significant difference between the two types of sentences, with a stronger correlation.
Published in: 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS)
Date of Conference: 19-20 February 2021
Date Added to IEEE Xplore: 12 April 2021