Vehicle detection has been an important research field for years as there are a lot of valuable applications, ranging from support of traffic planners to real-time traffic management. Especially detection of cars in dense urban areas is of interest due to the high traffic volume and the limited space. In city areas many car-like objects (e.g., dormers) appear which might lead to confusion. Additionally, the inaccuracy of road databases supporting the extraction process has to be handled in a proper way. This paper describes an integrated real-time processing chain which utilizes multiple occurrence of objects in images. At least two subsequent images, data of exterior orientation, a global DEM, and a road database are used as input data. The segments of the road database are projected in the non-geocoded image using the corresponding height information from the global DEM. From amply masked road areas in both images a disparity map is calculated. This map is used to exclude elevated objects above a certain height (e.g., buildings and vegetation). Additionally, homogeneous areas are excluded by a fast region growing algorithm. Remaining parts of one input image are classified based on the `Histogram of oriented Gradients (HoG)' features. The implemented approach has been verified using image sections from two different flights and manually extracted ground truth data from the inner city of Munich. The evaluation shows a quality of up to 70 percent.