Advancing Face Parsing in Real-World: Synergizing Self-Attention and Self-Distillation


An overview of our proposed model architecture.

Abstract:

Face parsing, the segmentation of facial components at the pixel level, is pivotal for comprehensive facial analysis. However, previous studies showed reduced performance on small or thin classes such as necklaces and earrings, and struggled to adapt to occlusions such as masks, glasses, caps, or hands. To address these issues, this study proposes a robust face parsing technique that integrates self-attention and self-distillation. The self-attention module enriches contextual information, enabling precise feature identification for each facial element. Multi-task learning for edge detection, coupled with a specialized loss function that focuses on edge regions, improves the model's understanding of fine structures and contours. Additionally, applying self-distillation during fine-tuning proves highly efficient: it produces refined parsing results, maintains high performance when labels are limited, and ensures robust generalization. Together, self-attention and self-distillation address the challenges of previous studies, particularly in handling small or thin classes. This combination enhances overall performance, achieves computational efficiency, and aligns with the latest trends in this research area. The proposed approach attains a Mean F1 score of 88.18% on the CelebAMask-HQ dataset, a significant advance in face parsing with state-of-the-art performance. Even in challenging occlusion areas such as hands and masks, it achieves an F1 score of over 99%, demonstrating robust face parsing in real-world environments.
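
The Mean F1 score quoted above is a class-averaged metric. As a rough illustration only, the sketch below computes a per-class F1 from a predicted and a ground-truth label map and then averages over classes. The function names, the toy label maps, and the choice to skip classes absent from both maps are assumptions made for this example; the abstract does not specify the paper's exact evaluation protocol (for instance, whether background is included or whether counts are accumulated per image or over the whole dataset).

```python
from typing import Dict, Sequence


def per_class_f1(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> Dict[int, float]:
    """Return F1 per class for two label maps of identical shape.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    tp = [0] * num_classes  # pixels correctly assigned to the class
    fp = [0] * num_classes  # pixels wrongly assigned to the class
    fn = [0] * num_classes  # pixels of the class that were missed
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                tp[p] += 1
            else:
                fp[p] += 1
                fn[t] += 1
    scores: Dict[int, float] = {}
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        if denom > 0:  # assumption: skip classes absent from both maps
            scores[c] = 2 * tp[c] / denom
    return scores


def mean_f1(scores: Dict[int, float]) -> float:
    """Unweighted average of the per-class F1 scores."""
    if not scores:
        raise ValueError("no classes present to average over")
    return sum(scores.values()) / len(scores)


# Toy 2x4 label maps with three classes (0, 1, 2); values are made up.
target = [[0, 0, 1, 1],
          [0, 2, 2, 1]]
pred = [[0, 0, 1, 1],
        [0, 2, 1, 1]]

scores = per_class_f1(pred, target, num_classes=3)
print(scores, mean_f1(scores))
```

Because the average is unweighted, a class covering only a few pixels (an earring, say) counts as much as a large one such as skin, which is why errors on small or thin classes can pull the mean down noticeably.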
Published in: IEEE Access ( Volume: 12)
Page(s): 29812 - 29823
Date of Publication: 21 February 2024
Electronic ISSN: 2169-3536
