Abstract:
Early depression detection research employs machine learning models trained on crowdsourced data. Such training data easily suffer from label noise, owing to people's weak self-perception and the uncontrollability of the collection process. The noise issue is seldom discussed in previous depression detection work. In this work, we first introduce the influence-based relabeling method into the depression detection task to revise noisy labels, and then go one step further by proposing a threshold ratio function to control the relabeling sample size. The relabeling sample size is usually ignored in previous influence-function methods, so the relabeling is sometimes excessive, causing a large shift in the distribution of the training data and a decline in model performance. Our proposed method aims to avoid such large changes to the training data. To achieve this, we design an adjustable ratio threshold for the samples to be relabeled. The ratio is adjusted according to the performance of the trained model: if the model performs well on the validation set, the relabeling ratio tends to be mild; otherwise, the relabeling can be aggressive. In the experiments, we recruited 205 participants and collected usage data from their smartphones and wearable bands, as well as the participants’ responses to the Depression Anxiety Stress Scale-21 questionnaire. We discuss several mainstream denoising methods and compare four of the most recent methods on the depression detection task. The proposed model achieves a best test F1 score of 86.3%.
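The idea of a performance-dependent relabeling ratio can be illustrated with a minimal sketch. It is not the paper's actual threshold ratio function, which the abstract does not specify. The linear form `max_ratio * (1 - val_f1)`, the names `relabel_ratio`, `relabel`, and `harm_scores`, and the binary label flip are all assumptions made for illustration; `harm_scores` stands in for per-sample influence estimates computed elsewhere.

```python
from typing import List


def relabel_ratio(val_f1: float, max_ratio: float = 0.3) -> float:
    """Hypothetical threshold ratio: the better the validation F1,
    the smaller the fraction of training samples allowed to be relabeled."""
    val_f1 = min(max(val_f1, 0.0), 1.0)  # clamp to [0, 1]
    return max_ratio * (1.0 - val_f1)


def relabel(labels: List[int], harm_scores: List[float],
            val_f1: float, max_ratio: float = 0.3) -> List[int]:
    """Flip the binary labels of the most harmful samples, capped by the ratio.

    harm_scores[i] is an assumed influence estimate: a larger positive value
    means sample i's current label hurts validation performance more.
    """
    k = int(relabel_ratio(val_f1, max_ratio) * len(labels))  # relabeling budget
    ranked = sorted(range(len(labels)),
                    key=lambda i: harm_scores[i], reverse=True)
    new_labels = list(labels)
    for i in ranked[:k]:
        if harm_scores[i] > 0:  # never flip samples estimated to be helpful
            new_labels[i] = 1 - new_labels[i]
    return new_labels
```

Under these assumptions, a model with a high validation F1 gets a small relabeling budget, so the training distribution changes little, while a poorly performing model lets up to `max_ratio` of the samples be revised.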
Date of Conference: 09-12 October 2022
Date Added to IEEE Xplore: 18 November 2022
ISBN Information: