We propose a system that applies low-level image segmentation, followed by color and two-dimensional (2-D) shape matching, to automatically group the resulting low-level segments into objects based on their similarity to a set of example object templates supplied by the user. A hierarchical content tree data structure is used for each database image to store matching combinations of low-level regions as objects. The system automatically initializes the content tree with only "elementary nodes" representing homogeneous low-level regions. The "learning" phase refers to the labeling of combinations of low-level regions that have produced successful color and/or 2-D shape matches with the example template(s). These combinations are labeled as "object nodes" in the hierarchical content tree. Once learning has been performed, subsequent retrieval of learned objects from the database is significantly faster. The learning step can be performed off-line, provided that example objects are given in the form of user interest profiles. Experimental results on collections of car and face images demonstrate the effectiveness of the proposed system, with its hierarchical content tree representation and learning by color and 2-D shape matching.
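To make the content tree concrete, the following is a minimal sketch of how elementary nodes, object nodes, and retrieval of learned objects might fit together. It is an illustration under stated assumptions, not the system's actual implementation: the names `Node`, `ContentTree`, `learn`, and `retrieve` are hypothetical, regions are represented only by integer ids, and the color and 2-D shape matching step is assumed to happen elsewhere and simply reports which region combination matched a template.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass
class Node:
    """One node of the (hypothetical) content tree for a single image."""
    regions: FrozenSet[int]              # ids of the low-level regions covered
    label: Optional[str] = None          # None for elementary nodes
    children: List["Node"] = field(default_factory=list)


class ContentTree:
    def __init__(self, region_ids: Iterable[int]):
        # Initialization: only elementary nodes, one per homogeneous region.
        self.elementary: Dict[int, Node] = {
            r: Node(frozenset([r])) for r in region_ids
        }
        # Object nodes created during learning, indexed by template label.
        self.objects: Dict[str, List[Node]] = {}

    def learn(self, label: str, region_ids: Iterable[int]) -> Node:
        """Store a region combination that an external matcher (assumed,
        not shown) found to match the template named `label`."""
        ids = list(region_ids)
        children = [self.elementary[r] for r in ids]
        node = Node(frozenset(ids), label, children)
        self.objects.setdefault(label, []).append(node)
        return node

    def retrieve(self, label: str) -> List[Node]:
        """Return already-learned object nodes without re-running matching."""
        return self.objects.get(label, [])


# Usage: an image segmented into five regions; regions 1, 2 and 4 are
# assumed to have matched a "car" template during learning.
tree = ContentTree(range(5))
tree.learn("car", [1, 2, 4])
cars = tree.retrieve("car")
```

In this sketch, a later query for a learned label becomes a dictionary lookup on stored object nodes rather than a new search over region combinations, which is one plausible reading of why retrieval speeds up after learning.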