A syntactic approach for processing mathematical expressions in printed documents
Garain, U.; Chaudhuri, B.B.
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Volume 4, 2000, Page(s): 523-526
Digital Object Identifier 10.1109/ICPR.2000.902972
Summary: We propose an approach for understanding mathematical expressions
in printed documents. The overall approach is divided into three main
steps: (i) detection of mathematical expressions in a document, (ii)
recognition of the symbols present in the expression and (iii)
arrangement of the recognized symbols. The detection of mathematical
expressions is done by recognizing a few of the most common symbols and
exploiting some structural features of the expressions. A hybrid of
feature-based and template-based techniques is used for the recognition
of symbols. A two-pass approach is used for arrangement of the symbols.
The first pass (scanning or lexical analysis) performs a micro-level
examination of the symbols in order to identify the symbol groups
occurring in the expression and to determine their categories or descriptors. The
second pass (parsing or syntax analysis) processes the descriptors
synthesized in the first pass, to determine the syntactic structure of
the expression. A set of predefined rules guides the activities in both
passes. Experiments conducted using this approach on a large number
of documents show high accuracy.
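
The two-pass arrangement step described above can be illustrated with a minimal sketch. It is not the authors' rule set, which the abstract does not give. The descriptor names (NUM, ID, OP, SUP), the Symbol fields, the baseline and raise_threshold parameters, and the convention that a larger y means higher on the page are all assumptions made for illustration. The first function groups recognized symbols into descriptors; the second builds a nested structure from those descriptors.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    char: str   # label produced by symbol recognition
    x: float    # left edge of the bounding box
    y: float    # vertical centre; larger means higher (assumed convention)


def scan(symbols, baseline=0.0, raise_threshold=0.5):
    """Pass 1 (lexical analysis): turn symbols into (descriptor, text) tokens."""
    tokens = []
    for s in sorted(symbols, key=lambda sym: sym.x):
        if s.char.isdigit():
            kind = "NUM"
        elif s.char.isalpha():
            kind = "ID"
        else:
            kind = "OP"
        # Hypothetical rule: a raised operand is treated as a superscript.
        if kind != "OP" and (s.y - baseline) > raise_threshold:
            kind = "SUP"
        # Group adjacent digits on the same level into one multi-digit token.
        if (tokens and s.char.isdigit()
                and tokens[-1][0] == kind and tokens[-1][1].isdigit()):
            tokens[-1] = (kind, tokens[-1][1] + s.char)
        else:
            tokens.append((kind, s.char))
    return tokens


def parse(tokens):
    """Pass 2 (syntax analysis): build a nested tuple from the descriptors.

    Assumed toy grammar: expr := term (OP term)* ; term := (NUM | ID) [SUP]
    Operators are applied left to right with no precedence.
    """
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] not in ("NUM", "ID"):
            raise ValueError(f"operand expected at token {pos}")
        node = tokens[pos][1]
        pos += 1
        if pos < len(tokens) and tokens[pos][0] == "SUP":
            node = ("^", node, tokens[pos][1])
            pos += 1
        return node

    tree = term()
    while pos < len(tokens):
        kind, text = tokens[pos]
        if kind != "OP":
            raise ValueError(f"operator expected at token {pos}")
        pos += 1
        tree = (text, tree, term())
    return tree


# Symbols for "x^2 + 10" as a recognizer might emit them (made-up coordinates).
symbols = [
    Symbol("x", 0, 0.0), Symbol("2", 1, 1.0), Symbol("+", 2, 0.0),
    Symbol("1", 3, 0.0), Symbol("0", 4, 0.0),
]
tokens = scan(symbols)
tree = parse(tokens)
```

Separating the passes keeps the spatial reasoning (which symbols belong together, which are raised) out of the grammar, so the parser only sees a flat stream of descriptors.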