Extracting High Level Semantics by Means of Speech, Audio, and Image Primitives in Surveillance Applications | IEEE Conference Publication