Abstract:
The main contribution of this research is a neural network concept for embedding Ukrainian-language text with a two-level architecture. The basic embedding is built on the main lexical structures (Noun-Verb, Noun-Adjective, Verb-Adverb). To preserve the semantics of the long, complex sentences characteristic of the Ukrainian language, a post-embedding procedure is defined analytically. Both levels of the concept are implemented as autoencoder-type neural networks with appropriate topologies. To evaluate the complexity of training the proposed conglomerate of networks, a generalized loss function is formalized. In tests against classic (BOW, Word2Vec) and progressive (MCB BERT) analogues, the proposed solution outperformed the former and lagged behind the latter on the selected quality metric.
Published in: 2023 13th International Conference on Advanced Computer Information Technologies (ACIT)
Date of Conference: 21-23 September 2023
Date Added to IEEE Xplore: 17 October 2023