Automatic classification of the figures in healthcare documents is known to be useful for biomedical document mining. The context of a document is directly reflected in the figures it contains. Text embedded in these figures, together with image features, has been used for figure retrieval. We demonstrate that image features based on the structural properties of figures are, on their own, sufficient for the figure retrieval task. We present the Fourier Edge Orientation Autocorrelogram, an algorithm that describes the structural properties of embedded images using the spatial distribution of detected edges. We show that the Fourier Edge Orientation Autocorrelogram performs better than its predecessor when most of the edge information is retained. The algorithm is validated on publicly available figures from the healthcare literature. In addition to being invariant to scale, rotation and non-uniform illumination, the proposed feature descriptor is shown to be relatively robust to noisy edges. Since no standard dataset is available for figure classification, we compare the proposed feature descriptor with four well-known binary shape descriptors. The retrieval performance shows an overall improvement over other known methods on the figure retrieval task.
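The general idea of an edge orientation autocorrelogram followed by a Fourier step can be illustrated with a minimal sketch. This is not the validated implementation. The function names, the bin count, the distance set, the gradient-based edge detector, the restriction to horizontal and vertical pixel pairs, and the use of a DFT magnitude along the orientation axis are all assumptions made for illustration.

```python
import numpy as np


def edge_orientation_autocorrelogram(image, n_bins=8, distances=(1, 3, 5, 7),
                                     edge_frac=0.1):
    """Hypothetical sketch: count pairs of edge pixels that lie k pixels apart
    and share a quantized orientation. Returns an (n_bins, len(distances))
    matrix normalized to sum to 1 (all zeros if no edges are found)."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)              # derivatives along rows, then columns
    mag = np.hypot(gx, gy)
    hist = np.zeros((n_bins, len(distances)))
    if mag.max() == 0:
        return hist

    # Assumed edge detector: keep pixels whose gradient magnitude exceeds a
    # fraction of the maximum, so a global intensity scaling has no effect.
    is_edge = mag > edge_frac * mag.max()

    # Gradient direction folded into [0, pi), then quantized into n_bins.
    theta = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((theta / np.pi * n_bins).astype(int), n_bins - 1)
    label = np.where(is_edge, bins, -1)    # -1 marks non-edge pixels

    for j, k in enumerate(distances):
        # Simplification: only horizontal and vertical neighbours at distance k.
        pairs = ((label[:, :-k], label[:, k:]), (label[:-k, :], label[k:, :]))
        for a, b in pairs:
            same = (a == b) & (a >= 0)
            hist[:, j] += np.bincount(a[same], minlength=n_bins)

    total = hist.sum()
    return hist / total if total > 0 else hist


def fourier_edge_descriptor(image, **kwargs):
    """Assumed Fourier step: DFT magnitude along the orientation axis, which
    is unchanged by circular shifts of the orientation bins."""
    eoac = edge_orientation_autocorrelogram(image, **kwargs)
    return np.abs(np.fft.fft(eoac, axis=0))
```

In this sketch, rotating a figure by a multiple of the bin width circularly shifts the rows of the autocorrelogram, and taking the DFT magnitude along that axis discards the shift. A call such as `fourier_edge_descriptor(img)` on a 2-D grayscale array returns a matrix that could be flattened and compared between figures with any vector distance.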