This paper presents an image understanding system that uses a semantic-syntactic (or attributed-symbolic) representation scheme in the form of attributed relational graphs (ARG's) for comprehending the global information content of images. Nodes in the ARG represent the global image features, while the relations between those features are represented by attributed branches between their corresponding nodes. The ARG representation is extracted from images by a multilayer graph transducer scheme. This scheme is essentially a rule-based system that combines model-driven and data-driven concepts to perform a hierarchical symbolic mapping of the image information content from the spatial-domain representation into a global representation. Further analysis and interpretation of the imagery data is performed on the extracted ARG representation. A distance measure between images is defined in terms of the distance between their respective ARG representations. The distance between two ARG's and the inexact matching of their respective components are calculated by an efficient dynamic programming technique. The system handles noise, distortion, and ambiguity in real-world images in two ways: by modeling them and embedding them into the transducer's mapping rules, and by assigning appropriate error-transformation costs in the inexact matching of the ARG image representations. Two illustrative experiments are presented to demonstrate some capabilities of the proposed system. Experiment I deals with locating objects in multiobject scenes, while Experiment II is concerned with target detection in SAR images.
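To make the idea of an ARG and a dynamic-programming distance concrete, the following is a minimal sketch under stated assumptions, not the algorithm of this paper. It assumes that nodes carry fixed-length numeric attribute tuples, that the nodes of each graph are given in a comparable order, and that one constant cost covers both insertion and deletion. It compares node attributes only and ignores the branch attributes. The names `ARG`, `substitution_cost`, `arg_distance`, and `indel_cost` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ARG:
    """Toy attributed relational graph (hypothetical layout)."""
    nodes: list = field(default_factory=list)     # each node: tuple of numeric attributes
    branches: dict = field(default_factory=dict)  # (i, j) -> tuple of relation attributes


def substitution_cost(a, b):
    """Assumed cost of transforming node a into node b: L1 attribute difference."""
    return sum(abs(x - y) for x, y in zip(a, b))


def arg_distance(g1, g2, indel_cost=1.0):
    """Edit-distance style dynamic program over the two ordered node lists.

    Branch attributes are not used in this simplified sketch.
    """
    n, m = len(g1.nodes), len(g2.nodes)
    # d[i][j] = cheapest way to transform the first i nodes of g1
    # into the first j nodes of g2.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost  # delete all i nodes
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost  # insert all j nodes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,  # delete a node of g1
                d[i][j - 1] + indel_cost,  # insert a node of g2
                d[i - 1][j - 1]
                + substitution_cost(g1.nodes[i - 1], g2.nodes[j - 1]),
            )
    return d[n][m]


g1 = ARG(nodes=[(1.0, 2.0), (3.0, 4.0)])
g2 = ARG(nodes=[(1.0, 2.0), (3.0, 4.5), (0.0, 0.0)])
print(arg_distance(g1, g2))
```

In this toy example the cheapest transformation keeps the first node, substitutes the second at cost 0.5, and inserts the third at cost 1.0. Raising `indel_cost` makes the match less tolerant of missing or spurious features, which is one simple way to picture the role of error-transformation costs.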