
The Influence of Different Stylometric Features on the Classification of Prose by Centuries


Abstract:

In this paper, the authors compare different types of stylometric features by classification quality: low-level features, which include character-based and word-based ones, and high-level rhythm features. The authors classified texts by century using each feature type separately and in combination, applying four classifiers: the Random Forest and AdaBoost meta-algorithms, an LSTM neural network, and a GRU neural network. Experiments with three text corpora in English, Russian, and French showed that combining rhythm features with low-level features significantly improved the quality of classification by century. In addition, the classification results made it possible to compare writing styles in different languages from the point of view of sentence structure.
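
To make the notion of low-level features concrete, the following is a minimal sketch of how character-based and word-based statistics might be computed for a text and merged with another feature group before being passed to a classifier. The abstract does not list the actual features, so the four statistics below, the function names low_level_features and combine_features, and the placeholder rhythm value in the usage note are all illustrative assumptions, not the authors' feature set or code.

```python
import re
from typing import Dict


def low_level_features(text: str) -> Dict[str, float]:
    """Illustrative character- and word-based statistics.

    These four features are assumptions chosen for the sketch;
    the abstract does not specify the paper's feature list.
    """
    # Runs of Unicode letters, so English, Russian, and French all work.
    words = re.findall(r"[^\W\d_]+", text.lower())
    chars = [c for c in text if not c.isspace()]
    n_chars = len(chars) or 1  # avoid division by zero on empty input
    n_words = len(words) or 1
    return {
        # Character-based: share of letters and of punctuation/symbols.
        "letter_share": sum(c.isalpha() for c in chars) / n_chars,
        "punct_share": sum(not c.isalnum() for c in chars) / n_chars,
        # Word-based: mean word length and vocabulary richness.
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "type_token_ratio": len(set(words)) / n_words,
    }


def combine_features(**groups: Dict[str, float]) -> Dict[str, float]:
    """Merge named feature groups into one flat, prefixed feature dict."""
    return {
        f"{group}:{name}": value
        for group, feats in groups.items()
        for name, value in feats.items()
    }
```

For example, combine_features(low=low_level_features(text), rhythm={"placeholder": 0.0}) yields a single dictionary in which each key is prefixed with its group name. The rhythm entry here is only a stand-in, since the abstract does not define the rhythm features. Per-text dictionaries of this kind could then be turned into vectors for a classifier such as a Random Forest.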
Date of Conference: 07-09 September 2020
Date Added to IEEE Xplore: 02 October 2020
ISBN Information:
Print on Demand(PoD) ISSN: 2305-7254
Conference Location: Trento, Italy
