This paper is concerned with noninvasive monitoring of the human larynx using subjects' questionnaire data. Random forests (RF) are applied to categorize the questionnaire data into a healthy class and several classes of disorders: cancerous, noncancerous, diffuse, nodular, paralysis, and an overall pathological class. The most important questionnaire statements are determined using RF variable importance evaluations. To explore the multidimensional data, t-distributed stochastic neighbor embedding (t-SNE) and multidimensional scaling (MDS) are applied to the RF data proximity matrix. When the developed tools were tested on data collected from 109 subjects, 100% classification accuracy was obtained on unseen data from two classes, healthy and pathological. An accuracy of 80.7% was achieved when classifying the data into the healthy, cancerous, and noncancerous classes. The t-SNE and MDS mapping techniques facilitate data exploration aimed at identifying subjects belonging to a "risk group". It is expected that the developed tools will be of great help in preventive health care in laryngology.
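The RF data proximity matrix mentioned above can be illustrated with a minimal sketch. The abstract does not say how the proximities are computed, so this assumes the commonly used definition: the proximity of two subjects is the fraction of trees in which they reach the same terminal node. The function names `rf_proximity` and `to_dissimilarity` and the toy leaf assignments are hypothetical and are not taken from the paper. In practice the leaf indices could come from a fitted forest, for example via scikit-learn's `RandomForestClassifier.apply`.

```python
from typing import List, Sequence


def rf_proximity(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the sample-by-sample proximity matrix.

    leaves[i][t] is the index of the leaf that sample i reaches in tree t.
    Proximity(i, j) = fraction of trees where i and j share a leaf
    (an assumed, commonly used definition).
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def to_dissimilarity(prox: Sequence[Sequence[float]]) -> List[List[float]]:
    """Convert proximities in [0, 1] to dissimilarities (1 - proximity)."""
    return [[1.0 - p for p in row] for row in prox]


# Made-up leaf assignments for 4 hypothetical subjects and 3 trees.
toy_leaves = [
    [1, 4, 2],
    [1, 4, 3],
    [5, 6, 7],
    [5, 6, 2],
]
toy_prox = rf_proximity(toy_leaves)
toy_diss = to_dissimilarity(toy_prox)
```

The resulting dissimilarity matrix is the kind of input that MDS or t-SNE can embed in two dimensions when run on precomputed distances. Subjects that the forest rarely separates end up close together in the map, which is what makes the visual search for a "risk group" possible.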