Abstract:
This research is devoted to the Natural Language Processing problem of sentence segmentation of raw text. It focuses on segmenting auto-generated video transcripts, which contain no punctuation or sentence boundaries. Two general approaches to sentence segmentation are proposed, and experiments compare the results of pre-trained transformer-based models. The effect of the chosen approach on the results is also examined. The sequence labeling approach turned out to be the most suitable.
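To illustrate what a sequence labeling formulation of sentence segmentation can look like, here is a minimal sketch. It is not the authors' implementation: the binary label scheme (1 marks the last token of a sentence, 0 any other token), the hypothetical helper `labels_to_sentences`, and the hand-written labels are all assumptions made for illustration. In practice the labels would come from a token classifier such as a fine-tuned transformer.

```python
from typing import List


def labels_to_sentences(tokens: List[str], labels: List[int]) -> List[str]:
    """Group tokens into sentences from per-token boundary labels.

    Assumed label scheme (illustrative only): 1 = token ends a sentence,
    0 = any other token.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    sentences: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        current.append(token)
        if label == 1:
            # A boundary label closes the sentence built so far.
            sentences.append(" ".join(current))
            current = []
    if current:
        # Keep trailing tokens that were never closed by a boundary label.
        sentences.append(" ".join(current))
    return sentences


# Unpunctuated transcript fragment with hand-written (hypothetical) labels.
tokens = "hello everyone today we talk about transformers lets begin".split()
labels = [0, 1, 0, 0, 0, 0, 1, 0, 1]
sentences = labels_to_sentences(tokens, labels)
```

Framing the task this way means the model makes one decision per token, so the segmentation is recovered by a single pass over the predicted labels.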
Published in: 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&T)
Date of Conference: 06-09 October 2020
Date Added to IEEE Xplore: 02 July 2021