Abstract:
Handwritten mathematical expression recognition (HMER) is an essential task in the OCR community, which consists of two sub-tasks, i.e., symbol recognition and structure ...Show MoreMetadata
Abstract:
Handwritten mathematical expression recognition (HMER) is an essential task in the OCR community, which consists of two sub-tasks, i.e., symbol recognition and structure parsing. Modern literature treats HMER as a LaTeX sequence predicting problem that simultaneously recognizes symbols and parses the structures of MEs. Although deep learning-based HMER methods have been achieving promising results on public benchmarks, it is admitted that the misclassification error between visually similar symbols still prevents these approaches from more generalized scenes. In this paper, we try to solve this issue from three aspects. 1) We enhanced the feature extraction progress by introducing path signature features, which incorporates local writing details and global spatial information. 2) We developed a language model that uses contextual information to correct the symbols misclassified by vision-only-based recognition models. 3) We solved the misalignment problem in existing ensemble method by designing a dynamic time warping (DTW) based algorithm. By combining the above improvements, our method achieved state-of-the-art results on three CROHME benchmarks, outperforming previous methods by a large margin.
Published in: IEEE Transactions on Multimedia ( Volume: 26)