Abstract:
A label noise model is a technique for constructing controlled noisy datasets to evaluate noise-robust algorithms. However, the quality of the generated noise has not been thoroughly evaluated. In this paper, we pose a novel research question: do constructed datasets with the same noise rate have equal effects? We answer this question through a carefully designed experiment: we sequentially generate a series of noisy datasets with equal noise rates, excluding previously used noisy samples while controlling the clean samples. Models trained on these datasets show discrepancies in generalization performance, indicating that the noise is not equal. Our in-depth analysis reveals two causes: (1) the introduction of non-hard samples and (2) the inequality among hard samples. We propose a primary equal-quality instance-dependent label noise model, termed EQ-IDN, which alleviates both issues through the identification of hard samples and stratified sampling. We compare the quality of the noise generated by EQ-IDN with that of other models. Experimental results demonstrate that the noise generated by EQ-IDN shows smaller differences and more stable quality.
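The sequential construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sequential_noisy_label_sets` and its parameters are hypothetical, samples to corrupt are chosen uniformly at random as a stand-in for an instance-dependent noise model, and the "controlling the clean samples" step is not modeled. The sketch shows only the core constraint that every round has the same noise rate and never reuses a sample that was noisy in an earlier round.

```python
import random


def sequential_noisy_label_sets(labels, num_classes, noise_rate, num_rounds, seed=0):
    """Build `num_rounds` noisy copies of `labels` with the same noise rate.

    Each round corrupts a set of indices disjoint from all earlier rounds.
    Assumption: indices are picked uniformly at random, not by an
    instance-dependent criterion as in the paper.
    """
    rng = random.Random(seed)
    n = len(labels)
    per_round = int(round(noise_rate * n))
    if per_round * num_rounds > n:
        raise ValueError("not enough samples for disjoint noisy sets")

    # Shuffle once, then hand out consecutive, non-overlapping slices.
    available = list(range(n))
    rng.shuffle(available)

    datasets = []
    for r in range(num_rounds):
        chosen = available[r * per_round:(r + 1) * per_round]
        noisy = list(labels)
        for i in chosen:
            # Draw a class uniformly from the classes other than the true one.
            wrong = rng.randrange(num_classes - 1)
            if wrong >= labels[i]:
                wrong += 1
            noisy[i] = wrong
        datasets.append((noisy, sorted(chosen)))
    return datasets
```

Training one model per returned label set and comparing test accuracy would mirror the comparison the abstract describes. Under this sketch the rounds differ only in which samples were corrupted and which wrong labels were drawn.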
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025