MCA2Net-HFG: Handwritten Font Generation Based on Res2Net and Multi-Spectral Channel Attention


The model is divided into two parts: an encoder and a decoder. The encoder consists of a CNN and a Transformer. We use Res2Net and Multi-spectral Channel Attention as the...

Abstract:

Handwritten font generation models extract insufficient features for stroke position, tilt angle, and stroke thickness, which leads to significant discrepancies in detail between generated fonts and real handwritten fonts. To address this issue, we propose a handwritten font generation model, MCA2Net-HFG. The model uses Residual Resolution Network (Res2Net) and Multi-spectral Channel Attention (MCA) as the main structure of the encoder. We use Res2Net as the backbone of the convolutional neural network (CNN) encoder to capture font features at different scales, such as stroke thickness, curvature, and position, as well as other fine-grained characteristics. Incorporating MCA into Res2Net captures key feature information located at different frequencies within the channels, enabling the model to better learn important font characteristics during training. This helps ensure the authenticity of the generated handwritten fonts. Experimental results on the IAM dataset show that, compared to the current best methods, MCA2Net-HFG reduces the Fréchet Inception Distance (FID) by 18.60% and the Geometry Score (GS) by 50.59%. This indicates that MCA2Net-HFG generates higher-quality handwritten fonts and has greater practical value. Results on different test sets further validate the model’s good generalization performance.
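
The abstract does not spell out how the attention module is built. As a rough illustration of the general idea behind multi-spectral channel attention, the sketch below compresses each channel of a feature map into one scalar by projecting it onto a 2D discrete cosine transform (DCT) basis function, then uses that scalar to gate the channel. The function names, the frequency assignment, and the plain sigmoid gate (with no learned layers) are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def dct_basis(u, v, height, width):
    """2D DCT-II basis function for frequency index (u, v) on a height x width grid."""
    return [
        [
            math.cos(math.pi * u * (h + 0.5) / height)
            * math.cos(math.pi * v * (w + 0.5) / width)
            for w in range(width)
        ]
        for h in range(height)
    ]


def channel_descriptors(feature_map, frequencies):
    """Reduce each channel to one scalar using that channel's assigned DCT frequency.

    feature_map: list of channels, each a height x width nested list of floats.
    frequencies: one (u, v) pair per channel (a hypothetical assignment).
    """
    descriptors = []
    for channel, (u, v) in zip(feature_map, frequencies):
        height, width = len(channel), len(channel[0])
        basis = dct_basis(u, v, height, width)
        descriptors.append(
            sum(
                channel[h][w] * basis[h][w]
                for h in range(height)
                for w in range(width)
            )
        )
    return descriptors


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def apply_channel_attention(feature_map, frequencies):
    """Rescale each channel by a gate in (0, 1) derived from its descriptor.

    Assumption: a trained module would pass the descriptors through learned
    fully connected layers before the sigmoid; that step is omitted here.
    """
    gates = [sigmoid(d) for d in channel_descriptors(feature_map, frequencies)]
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]


# Two identical 2x2 channels, each assigned a different frequency.
features = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]
freqs = [(0, 0), (0, 1)]
descriptors = channel_descriptors(features, freqs)
attended = apply_channel_attention(features, freqs)
```

With frequency (0, 0) the basis is constant, so the descriptor is simply the channel sum, which is global average pooling up to a scale factor. Assigning other frequencies to other channels lets the descriptors retain information that averaging alone discards.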
Published in: IEEE Access (Volume: 12)
Page(s): 168894 - 168903
Date of Publication: 17 September 2024
Electronic ISSN: 2169-3536
