This paper presents a video surveillance system for a stationary camera that extracts moving targets from a video stream in real time and classifies them into predefined categories according to their spatiotemporal properties. Targets are detected by computing the pixel-wise difference between consecutive frames and are then classified with a temporally boosted classifier and “spatiotemporal-oriented energy” analysis. We demonstrate that the proposed classifier can successfully recognize five types of objects: a person, a bicycle, a motorcycle, a vehicle, and a person with an umbrella. Targets that do not match any of the AdaBoost-based classifier's categories are passed to a secondary classification module that labels them as crowds of individuals or non-crowds. We show that this crowd/non-crowd classification can be performed effectively by analyzing a target's spatiotemporal-oriented energies, which provide a rich description of the target's spatial and dynamic features. Our experimental results demonstrate that the proposed system is extremely effective in recognizing all predefined object classes.
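
As an illustration of the detection step only, the pixel-wise difference between consecutive frames can be sketched as follows. This is a minimal sketch under stated assumptions, not the system's actual implementation: the function names `frame_difference_mask` and `bounding_box`, the grayscale list-of-rows frame representation, the threshold of 25 intensity levels, and the single bounding box around all changed pixels are all hypothetical choices made for the example.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose absolute intensity change exceeds `threshold`.

    Frames are grayscale images given as lists of rows of integers.
    The threshold value is an assumption chosen for illustration.
    """
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same number of rows")
    mask = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        if len(prev_row) != len(curr_row):
            raise ValueError("frames must have the same number of columns")
        mask.append([1 if abs(c - p) > threshold else 0
                     for p, c in zip(prev_row, curr_row)])
    return mask


def bounding_box(mask):
    """Return (top, left, bottom, right) enclosing all marked pixels, or None."""
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row)
              if value]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy example: a static 4x5 background and a small bright object that appears.
prev = [[0] * 5 for _ in range(4)]
curr = [[0] * 5 for _ in range(4)]
curr[1][2] = 200
curr[2][2] = 200
curr[2][3] = 200

mask = frame_difference_mask(prev, curr)
box = bounding_box(mask)
```

In this toy case the three changed pixels yield a candidate target region, which a later stage would pass to the classifiers. A practical system would typically also suppress noise and separate multiple moving regions before classification.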