Abstract:
In this work, we explore the effectiveness of log-Mel spectrogram and MFCC features for Alzheimer's dementia (AD) recognition on the ADReSS challenge dataset. We use three different deep neural networks (DNNs) for AD recognition and mini-mental state examination (MMSE) score prediction: (i) a convolutional neural network followed by a long short-term memory network (CNN-LSTM), (ii) a pre-trained ResNet18 network followed by an LSTM (ResNet-LSTM), and (iii) a pyramidal bidirectional LSTM followed by a CNN (pBLSTM-CNN). CNN-LSTM achieves an accuracy of 64.58% with MFCC features, and ResNet-LSTM achieves an accuracy of 62.5% using log-Mel spectrograms. The pBLSTM-CNN and ResNet-LSTM models achieve root mean square errors (RMSE) of 5.9 and 5.98, respectively, in MMSE score prediction using log-Mel spectrograms. Our best results improve on the baseline accuracy (62.5%) and RMSE (6.14) reported for acoustic features on the ADReSS challenge dataset. The results suggest that log-Mel spectrograms and MFCCs are effective features for the AD recognition problem when used with DNN models.
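The abstract reports two evaluation metrics: classification accuracy for AD recognition and RMSE for MMSE score prediction. The following is a minimal sketch of how these two quantities are conventionally computed. It is not the authors' evaluation code; the function names and the toy labels and scores are hypothetical and chosen only for illustration.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of class labels that were predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data (NOT from the paper): 1 = AD, 0 = non-AD.
labels_true = [1, 0, 1, 1, 0, 0, 1, 0]
labels_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Hypothetical MMSE scores (NOT from the paper); MMSE ranges from 0 to 30.
mmse_true = [30, 24, 18, 27]
mmse_pred = [28, 25, 21, 27]

print(f"accuracy = {accuracy(labels_true, labels_pred):.2%}")
print(f"RMSE     = {rmse(mmse_true, mmse_pred):.2f}")
```

Higher accuracy is better, while lower RMSE is better, which is why the reported RMSE values of 5.9 and 5.98 count as improvements over the 6.14 baseline.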
Published in: 2021 IEEE Spoken Language Technology Workshop (SLT)
Date of Conference: 19-22 January 2021
Date Added to IEEE Xplore: 25 March 2021
Medical Intelligence and Language Engineering Laboratory, Indian Institute of Science, Bengaluru, India