
Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, 2003

Date: 22 July 2003


Displaying Results 1 - 25 of 58
  • Proceedings IEEE Conference on Advanced Video and Signal Based Surveillance. AVSS 2003


    The following topics are dealt with: face detection; face recognition; tracking; motion analysis; object detection; event analysis/learning; change detection; feature selection/extraction; video registration; system/camera calibration.

  • Author index

    Page(s): 377 - 378
  • Detecting, recognizing and understanding video events in surveillance video


    Summary form only given. The Advanced Research and Development Activity (ARDA) is currently sponsoring an advanced research program called VACE, or Video Analysis and Content Extraction. The VACE program will embark on a second two-year R&D phase. The focus of this phase of VACE is on moving beyond the detection, recognition and tracking of objects in video streams to the detection, recognition and understanding of the activities that the objects are engaged in. The VACE program is interested in video events in all types of video, including news broadcast video, meeting/conference video, UAV motion imagery and ground reconnaissance video, as well as surveillance video. This brief overview, however, concentrates on VACE's interests in, and ARDA's goals/objectives for, detecting, recognizing and understanding video events in surveillance video.

  • A novel approach to detect and correct highlighted face region in color image

    Page(s): 7 - 12

    Variations in environmental illumination have a great impact on face detection and recognition. Automatic detection and radiant correction of highlighted regions on face images helps to identify human faces correctly in a color image. In this paper we present a novel approach, based on the dichromatic reflection model, to detect and remove highlights in the face region. After inspecting the distribution of skin pixels in various color models, we perform the highlight analysis on a critical two-dimensional plane instead of in a three-dimensional chromatic space. This brings several advantages: computational complexity is reduced, and a ratio between eigenvalues, computed in a stepwise PCA, automatically detects the existence of a highlighted face region and estimates the skin dichromatic reflection vectors, by which the highlight in the face region can be removed.

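The eigenvalue-ratio idea from the abstract above can be sketched with a toy 2D example. This is an illustrative reconstruction, not the paper's method: the function names, the threshold value, and the synthetic data are all assumptions.

```python
import numpy as np

def highlight_ratio(points):
    """Eigenvalue-ratio sketch on the dichromatic model: diffuse-only
    skin pixels lie close to a single line in the 2D chromatic plane
    (large lambda1/lambda2); a highlight adds a second reflection
    direction, inflating lambda2 and shrinking the ratio.
    `points` is an (N, 2) array of pixels on the chromatic plane."""
    cov = np.cov(points.T)                       # 2x2 covariance
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1] # descending eigenvalues
    return eig[0] / eig[1]

def has_highlight(points, ratio_thresh=20.0):
    # A small ratio means two substantial directions -> likely highlight.
    return highlight_ratio(points) < ratio_thresh
```

Diffuse-only pixels concentrate along a single body-reflection line, so the ratio is very large; adding a second (specular) branch drops it sharply.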
  • Face cataloger: multi-scale imaging for relating identity to location

    Page(s): 13 - 20

    The level of security at a facility is directly related to how well the facility can keep track of "who is where". The "who" part of this question is typically addressed through the use of face images for recognition, either by a person or by a computer face recognition system. The "where" part can be addressed through 3D position tracking. The "who is where" problem is inherently multi-scale: wide-angle views are needed for location estimation, and high-resolution face images for identification. A number of other people-tracking challenges, such as activity understanding, are multi-scale in nature. An effective system to answer "who is where?" must acquire face images without constraining the users and must closely associate the face images with the 3D path of the person. Our solution uses computer-controlled pan-tilt-zoom cameras driven by a 3D wide-baseline stereo tracking system. The pan-tilt-zoom cameras automatically acquire zoomed-in views of a person's head while the person is in motion within the monitored space.

  • Combined wavelet domain and temporal video denoising

    Page(s): 334 - 341

    We develop a new filter which combines spatially adaptive noise filtering in the wavelet domain and temporal filtering in the signal domain. For spatial filtering, we propose a new wavelet shrinkage method, which estimates how probable it is that a wavelet coefficient represents a "signal of interest" given its value, given the locally averaged coefficient magnitude and given the global subband statistics. The temporal filter combines a motion detector and recursive time-averaging. The results show that this combination outperforms single resolution spatio-temporal filters in terms of quantitative performance measures as well as in terms of visual quality. Even though our current implementation of the new filter does not allow real-time processing, we believe that its optimized software implementation could be used for real- or near real-time filtering.

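The temporal half of the filter described above, recursive time-averaging gated by a motion detector, can be sketched as follows. This is a minimal illustration under assumed parameter names (`alpha`, `motion_thresh`); the paper's wavelet-domain spatial stage is omitted entirely.

```python
import numpy as np

def temporal_denoise(frames, alpha=0.5, motion_thresh=20.0):
    """Motion-gated recursive time-averaging: pixels whose difference
    from the running average exceeds `motion_thresh` are treated as
    moving and taken from the current frame (avoiding ghosting);
    static pixels are blended into the running average."""
    avg = frames[0].astype(np.float64)
    out = [avg.copy()]
    for f in frames[1:]:
        f = f.astype(np.float64)
        moving = np.abs(f - avg) > motion_thresh
        blended = alpha * avg + (1 - alpha) * f   # recursive average
        avg = np.where(moving, f, blended)        # gate by motion mask
        out.append(avg.copy())
    return out
```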
  • A robust motion-estimation algorithm for multiple-target tracking at close proximity based on hexagonal partitioning

    Page(s): 107 - 112

    The paper develops a solution to the task of tracking the position and movement of large (relative to pixel and image size) bodies through a visible-light camera's field of view. Noise sources such as dynamic backgrounds, vibration, variation in appearance, and rapidly changing lighting environments contribute to the complexity. Given that a low number of targets may be present at any one time and some assumptions about the dynamic nature of the image and targets, a solution to this problem is formulated which localizes and tracks objects in the field of view. The algorithm is capable of distinguishing among multiple targets which are in close proximity to the camera and to each other. A major consideration in the development is that the implemented system should be able to process the data in real time with moderate computational power. The present area of application is that of automatically generating ridership statistics for transit agencies: to count persons getting on or off a bus.

  • Video extraction in compressed domain

    Page(s): 321 - 326

    In this paper, we propose a video extraction algorithm that operates directly in the compressed domain, providing low-cost and fast content access to MPEG-compressed video data. Extensive experiments show that the extracted images and videos not only maintain well-preserved content features, but also exhibit reasonable quality in terms of both PSNR values and visual inspection. In cases where video processing tasks do not require full-resolution pixel data, such as browsing, pattern recognition, and object tracking in surveillance applications, the proposed algorithm provides superior performance in terms of computing efficiency, in contexts where millions of video frames, stored in compressed format, need to be accessed.

  • Face tracking system based on color, stereovision and elliptical shape features

    Page(s): 21 - 26

    We present a vision system that tracks a human face in 3D. We combine color and stereo cues to find likely image regions where a face may exist. A greedy search algorithm looks for a face candidate, focusing attention around the position at which the face was detected in the previous time step. The aim of the search is to find the best-fit head ellipse. The size of the searched ellipse projected into the image is scaled depending on the depth information. The final position of the ellipse is determined on the basis of the intensity gradient near the edge of the ellipse, the depth gradient along the head boundary, and the matching of the color histograms representing the interiors of the current and previous ellipses. The color histogram and the parameters of the ellipse are dynamically updated over time and compared with previous ones. The frontal-view face is detected using PCA to make the tracking more reliable and, in particular, to update the color model over time with only face-like skin pixels.

  • Photometric aspects: a new approach for 3D free form object recognition using a single luminance image

    Page(s): 131 - 136

    3D free-form object recognition is one of the most difficult problems in computer vision. We present a new approach which exploits only one luminance image of a complex object to recognize it in the scene by identifying its appearance in the input image. We construct a photometric (non-geometric) projective invariant to perform matching between local regions of the object in the image and those of the model. We propose an original method based on what we call "photometric aspects" to construct a discriminative database of the 3D object model. We demonstrate the effectiveness of our approach by applying it to complex free-form objects and present some of the results obtained.

  • Human body pose estimation using silhouette shape analysis

    Page(s): 263 - 270

    We describe a system for human body pose estimation from multiple views that is fast and completely automatic. The algorithm works in the presence of multiple people by decoupling the pose estimation problems of different people. The pose is estimated based on a likelihood function that integrates information from multiple views and thus obtains a globally optimal solution. Other characteristics that make our method more general than previous work include: (1) no manual initialization; (2) no specification of the dimensions of the 3D structure; (3) no reliance on learned poses or patterns of activity; (4) insensitivity to edges and clutter in the background and within the foreground. The algorithm has applications in surveillance, and promising results have been obtained.

  • Adaptive live video streaming by priority drop

    Page(s): 342 - 347

    In this paper we explore the use of priority progress streaming (PPS) for video surveillance applications. PPS is an adaptive streaming technique for the delivery of continuous media over variable bit-rate channels. It is based on the simple idea of reordering media components within a time window into priority order before transmission. The main concern when using PPS for live video streaming is the time delay introduced by reordering. In this paper we describe how PPS can be extended to support live streaming and show that the delay inherent in the approach can be tuned to satisfy a wide range of latency constraints while supporting fine-grain adaptation.

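The core priority-drop idea behind PPS, reordering components within a time window and dropping the low-priority tail when bandwidth runs out, can be sketched in a few lines. The tuple layout and the `budget_bytes` parameter are illustrative assumptions, not the authors' API.

```python
def pps_window(components, budget_bytes):
    """Priority-progress sketch for one adaptation window: send
    components in descending priority order and drop whatever no
    longer fits in the window's bandwidth budget.
    Each component is a (priority, size_bytes, payload) tuple;
    higher priority means more important."""
    sent, used = [], 0
    for prio, size, payload in sorted(components, key=lambda c: -c[0]):
        if used + size <= budget_bytes:   # still room in this window
            sent.append(payload)
            used += size
    return sent
```

For live streaming, the window length is exactly the tunable delay the abstract mentions: a shorter window means less reordering latency but coarser adaptation.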
  • Human detection using depth and gray images

    Page(s): 115 - 121

    A method is presented for extracting pedestrian information from an image sequence taken by a monocular camera. The method makes use of hybrid sensing of depth and gray information and it is shown to work well in an indoor environment. A split-and-merge strategy is proposed to process depth data for object and human detection. Furthermore, human tracking and event detection are also presented to recognize simple behavior such as hand-shaking. This method does not use background subtraction, and therefore it is applicable for scenes taken from mobile platforms. Experimental results are presented to validate our approach.

  • A reliable-inference framework for recognition of human actions

    Page(s): 169 - 176

    We present an action recognition method based on the concept of reliable inference. Our approach is formulated in a probabilistic framework using posterior class ratios to verify the saliency of an input before committing to any action classification. The framework is evaluated in the context of walking, running, and standing at multiple viewpoints and compared to ML and MAP approaches. Results examining individual silhouette images with the framework demonstrate that these actions can be reliably discriminated while discounting confusing images.

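The posterior-ratio test described above can be illustrated with a small sketch. The class names, the uniform prior, and the ratio threshold are assumptions for illustration; the paper's probabilistic framework is richer than this.

```python
def classify_reliably(likelihoods, priors, ratio_thresh=3.0):
    """Commit to the MAP class only when its posterior dominates the
    runner-up by at least `ratio_thresh`; otherwise return None to
    signal that the input is too ambiguous to classify reliably."""
    posts = {c: likelihoods[c] * priors[c] for c in likelihoods}
    ranked = sorted(posts, key=posts.get, reverse=True)
    best, second = ranked[0], ranked[1]
    if posts[second] == 0 or posts[best] / posts[second] >= ratio_thresh:
        return best
    return None   # reject: posterior ratio too small to be reliable
```

Compared with plain ML/MAP, the only change is the rejection branch, which is what lets confusing silhouettes be discounted rather than misclassified.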
  • A frame-level FSBM motion estimation architecture with large search range

    Page(s): 327 - 333

    Motion estimation plays an important role in video compression by removing temporal redundancies between successive frames. It is also useful in aiding the detection of moving objects. Full-search block-matching (FSBM) is the most widely preferred algorithm for motion estimation. Frame-level pipelined FSBM architectures have advantages over block-level pipelined architectures in their simpler control and reduced number of memory accesses. In this paper, a frame-level pipelined FSBM motion estimation array processor for a large search range p = qN, where q ≥ 1/2 and N×N is the block size, is presented.

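The paper is about a hardware array processor, but the FSBM computation it pipelines can be sketched in software. Block size, search range `p`, and the SAD cost are the standard ingredients; the function below is an illustrative reference implementation, not the proposed architecture.

```python
import numpy as np

def fsbm(ref, cur, block=8, p=4):
    """Full-search block matching: for each block of `cur`, exhaustively
    test every displacement within +/-p pixels against `ref` and keep
    the one minimizing the sum of absolute differences (SAD).
    Returns {(block_y, block_x): (dy, dx)}."""
    H, W = cur.shape
    vectors = {}
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            blk = cur[by:by + block, bx:bx + block].astype(np.int64)
            best, best_sad = (0, 0), None
            for dy in range(-p, p + 1):
                for dx in range(-p, p + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > H or x + block > W:
                        continue  # candidate window falls off the frame
                    cand = ref[y:y + block, x:x + block].astype(np.int64)
                    sad = np.abs(blk - cand).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best = sad, (dy, dx)
            vectors[(by, bx)] = best
    return vectors
```

The exhaustive double loop over (dy, dx) is exactly the regular, data-independent workload that makes FSBM attractive for pipelined hardware.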
  • A multi-camera conical imaging system for robust 3D motion estimation, positioning and mapping from UAVs

    Page(s): 99 - 106

    Over the last decade, there has been increasing interest in developing vision systems and technologies that support the operation of unmanned platforms for positioning, mapping, and navigation. Until very recently, these developments relied on images from standard CCD cameras with a single optical center and a limited field of view, making them restrictive for some applications. Panoramic images have been explored extensively in recent years. The particular configuration of interest to our investigation yields a conical view, which is most applicable for airborne and underwater platforms. Instead of a single catadioptric camera (Gluckman, J.M. and Nayar, S.K., 1999; Swaminathan, R. et al., 2001), a combination of conventional cameras may be used to generate images at much higher resolution (Negahdaripour, S. et al., Proc. Oceans, 2001). We derive complete mathematical models of the projection and image motion equations for a down-looking conical camera that may be installed on a mobile platform, e.g. an airborne or submersible system for terrain flyover imaging. We describe the calibration of a system comprising multiple cameras with overlapping fields of view to generate the conical view. We demonstrate with synthetic and real data that such images provide better accuracy in 3D visual motion estimation, which is the underlying issue in 3D positioning, navigation, mapping, image registration and photo-mosaicking.

  • Improving the extraction of temporal motion strength signals from video recordings of neonatal seizures

    Page(s): 87 - 92

    The paper presents a procedure developed to extract quantitative information from video recordings of neonatal seizures in the form of temporal motion strength signals. These signals are obtained by applying nonlinear filtering, segmentation, and morphological filtering on the differences between adjacent frames. The experiments indicate that temporal motion strength signals constitute an effective representation of videotaped clinical events and can be used for seizure recognition and characterization.

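The pipeline above (frame differencing, thresholding, morphological filtering, then collapsing to one scalar per frame pair) can be approximated in a few lines. The crude 3x3 erosion and the threshold value are simplifying assumptions standing in for the paper's nonlinear and morphological filters.

```python
import numpy as np

def motion_strength(frames, thresh=15):
    """Motion-strength sketch: threshold adjacent-frame differences,
    apply a 3x3 binary erosion to suppress isolated noise pixels, and
    sum the surviving pixels into one scalar per frame pair."""
    def erode3(b):
        # a pixel survives only if its entire 3x3 neighbourhood is set
        out = b[1:-1, 1:-1].copy()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out &= b[1 + dy:b.shape[0] - 1 + dy,
                         1 + dx:b.shape[1] - 1 + dx]
        return out
    signal = []
    for prev, cur in zip(frames, frames[1:]):
        diff = np.abs(cur.astype(np.int64) - prev.astype(np.int64))
        mask = diff > thresh          # moving pixels
        signal.append(int(erode3(mask).sum()))
    return signal
```

The resulting 1D signal is what downstream seizure recognition would operate on, one sample per frame pair.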
  • A survey of camera self-calibration

    Page(s): 351 - 357

    The paper surveys the developments of the last 10 years in the area of camera self-calibration. Self-calibration is an attempt to calibrate a camera by finding intrinsic parameters that are consistent with the underlying projective geometry of a sequence of images. In order to solve this problem, the camera intrinsic constraints have been used separately and in conjunction with camera motion constraints or scene constraints. Most self-calibration algorithms are concerned with unknown but constant intrinsic camera parameters. Recently, camera self-calibration in the case of varying intrinsic camera parameters has also been studied. We present the basic theories behind the different self-calibration techniques and discuss the ideas behind most of the self-calibration algorithms.

  • Battlefields that see


    Summary form only given. We introduce two new applications of automated video analysis being pursued by DARPA. Video Verification of Identity (VIVID) is concerned with automatically tracking moving vehicles from a moving aircraft and searching the surrounding area for others who may be nearby. Combat Zones That See (CTS) is coordinating the analysis of large numbers of fixed cameras to identify and track vehicles over extended distances. Together, these programs illustrate the transformational power that the automated analysis of video has for military reconnaissance.

  • Automatic face region tracking for highly accurate face recognition in unconstrained environments

    Page(s): 29 - 36

    We present a combined real-time face region tracking and highly accurate face recognition technique for an intelligent surveillance system. High-resolution face images are very important for achieving accurate identification of a human face. Conventional surveillance or security systems, however, usually provide poor image quality because they use only fixed cameras to record scenes passively. We have implemented a real-time surveillance system that tracks a moving face using four pan-tilt-zoom (PTZ) cameras. While tracking, the region of interest (ROI) is obtained using a low-pass filter and background subtraction with the PTZ cameras. Color information in the ROI is updated to extract features for optimal tracking and zooming. FaceIt®, one of the most popular face recognition software packages, is evaluated and then used to recognize faces from the video signal. Experiments with real human faces showed highly acceptable results in terms of both accuracy and computational efficiency.

  • Adaptive object identification and recognition using neural networks and surface signatures

    Page(s): 137 - 142

    The paper introduces an adaptive technique for 3D object identification and recognition in 3D scanned scenes. The technique uses neural learning of the 3D free-form surface representation of the object under study. This representation scheme captures the 3D curvature information of any free-form surface and encodes it into a 2D image corresponding to a certain point on the surface. This image represents a "surface signature", because it is unique for that point and is independent of the object's translation or orientation in space.

  • Recognizing human activities

    Page(s): 157 - 162

    The paper deals with the problem of classification of human activities from video as one way of performing activity monitoring. Our approach uses motion features that are computed very efficiently and subsequently projected into a lower dimension space where matching is performed. Each action is represented as a manifold in this lower dimension space and matching is done by comparing these manifolds. To demonstrate the effectiveness of this approach, it was used on a large data set of similar actions, each performed by many different actors. Classification results are accurate and show that this approach can handle many challenges such as variations in performers' physical attributes, color of clothing, and style of motion. An important result is that the recovery of three-dimensional properties of a moving person, or even two-dimensional tracking of the person's limbs, is not a necessary step that must precede action recognition.

  • On registration of regions of interest (ROI) in video sequences

    Page(s): 313 - 318

    The paper addresses the problem of registering regions of interest in two video sequences. Potential applications include blob fusion and target tracking in blurry sequences. It is assumed that the moving target is tracked successfully in one of the two sequences and is represented by a bounding box in each frame of the first sequence. The goal is to find the corresponding bounding box for each frame of the second video sequence. The registration algorithm developed is based on mutual information. To facilitate the registration process, the two cameras are assumed to be calibrated such that the geometrical transformation required to register the corresponding bounding boxes is a 2D rigid body transformation without rotation. Visual and IR video sequences are used to test the proposed approach.

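Because the calibrated setup constrains the transformation to a 2D translation without rotation, the mutual-information search reduces to sliding the reference box over the second view. The histogram bin count and search radius below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Mutual information between two equally sized image patches,
    estimated from their joint intensity histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = hist / hist.sum()                  # joint distribution
    px, py = p.sum(axis=1), p.sum(axis=0)  # marginals
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz])).sum())

def register_box(ref_patch, target, search=5):
    """Translation-only registration: slide the reference bounding box
    over `target` within +/-`search` pixels of its nominal position
    and keep the (dy, dx) offset maximizing mutual information."""
    h, w = ref_patch.shape
    best, best_mi = (0, 0), -np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = search + dy, search + dx  # nominal position + offset
            if y0 < 0 or x0 < 0 or y0 + h > target.shape[0] or x0 + w > target.shape[1]:
                continue
            mi = mutual_information(ref_patch, target[y0:y0 + h, x0:x0 + w])
            if mi > best_mi:
                best_mi, best = mi, (dy, dx)
    return best
```

Mutual information, unlike SAD, needs no intensity correspondence between the two sensors, which is what makes it suitable for registering visual against IR boxes.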
  • Moving cast shadow elimination for robust vehicle extraction based on 2D joint vehicle/shadow models

    Page(s): 229 - 236

    A new algorithm to eliminate moving cast shadows for robust vehicle detection and extraction in a vision-based highway monitoring system is investigated. The proposed algorithm is based on simplified 2D vehicle/shadow models of six types projected onto the 2D image plane. The parameters of the joint 2D vehicle/shadow models can be estimated from the input video without light-source or camera calibration information. Simulations are performed to verify that the proposed technique is effective for vision-based highway surveillance systems.

  • Real-time detection of threat


    Summary form only given. Government agencies, personnel security professionals, and our military services face new challenges in rapidly assessing the credibility of statements made by individuals in airports, border crossings, secured facilities, and a variety of environments not conducive to prolonged interviews. The environment has become more global, and threats lie not just in securing an environment, but in the increased dependence on gathering accurate information from a variety of individuals. The most robust source available for obtaining information regarding the past and present behavior of an individual or group may be the person of interest. Some new technologies may offer assistance in this critical area. Current advanced research projects examining emerging technologies in operational credibility assessment are presented for discussion.
