Script Identification of Central Asia Based on Fused Texture Features


Abstract:

Script identification is an important step in multi-script recognition. Despite the results achieved in this field, the identification of Central Asian scripts has not been studied in depth. The Central Asian region has many similar scripts, and traditional texture features cannot discriminate them accurately. This paper proposes a script identification method based on fused texture features for Central Asian document images. On preprocessed multilingual document images, the method first applies the Non-subsampled Contourlet Transform (NSCT) and then extracts Tamura texture features from the generated sub-bands. A Support Vector Machine (SVM) classifier is then trained for classification. For experimental evaluation, a dataset of 30,000 document images was collected, covering 10 scripts: Arabic, Chinese, English, Russian, Kazakh, Turkish, Uyghur, Kyrgyz, Mongolian and Tibetan. The experimental results show that the proposed method can extract multi-scale and multi-directional texture features, and that fusing these texture features leads to superior script identification performance.
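
The idea of computing a Tamura feature on each sub-band and fusing the results into one vector can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several loudly stated assumptions: NSCT is replaced by a simple same-size (non-subsampled) box-filter pyramid that has no directional filter bank, only Tamura contrast is computed out of the Tamura feature set, and the SVM stage is omitted. All function names (`tamura_contrast`, `box_blur`, `decompose`, `fused_features`) are hypothetical.

```python
import math


def tamura_contrast(band):
    """Tamura contrast of a 2-D list: sigma / kurtosis ** (1/4)."""
    pixels = [v for row in band for v in row]
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((v - mean) ** 2 for v in pixels) / n
    if var == 0:
        return 0.0  # a flat band has no contrast
    mu4 = sum((v - mean) ** 4 for v in pixels) / n
    kurtosis = mu4 / var ** 2
    return math.sqrt(var) / kurtosis ** 0.25


def box_blur(img, radius):
    """Mean filter that keeps the image size (edges are clamped)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def decompose(img, levels=2):
    """ASSUMPTION: stand-in for NSCT, not NSCT itself.

    Returns `levels` same-size detail bands plus one low-pass residual.
    """
    bands = []
    current = img
    for level in range(levels):
        smooth = box_blur(current, radius=2 ** level)
        detail = [[c - s for c, s in zip(c_row, s_row)]
                  for c_row, s_row in zip(current, smooth)]
        bands.append(detail)
        current = smooth
    bands.append(current)
    return bands


def fused_features(img, levels=2):
    """Concatenate the per-band contrast values into one feature vector."""
    return [tamura_contrast(band) for band in decompose(img, levels)]


if __name__ == "__main__":
    # Synthetic vertical stripes standing in for a document image patch.
    stripes = [[255 if x % 2 else 0 for x in range(8)] for _ in range(8)]
    print(fused_features(stripes))
```

In a fuller pipeline, one such vector per document image would be passed to a classifier such as an SVM. Using more Tamura features (for example coarseness and directionality) and a true directional decomposition would lengthen the vector without changing the concatenation step.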
Date of Conference: 20-24 August 2018
Date Added to IEEE Xplore: 29 November 2018
ISBN Information:
Print on Demand (PoD) ISSN: 1051-4651
Conference Location: Beijing, China
