IEEE Transactions on Circuits and Systems for Video Technology

Issue 1 • Jan. 2013

  • Table of contents

    Publication Year: 2013, Page(s): C1
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Publication Year: 2013, Page(s): C2
  • Efficient Moving Object Detection for Lightweight Applications on Smart Cameras

    Publication Year: 2013, Page(s): 1 - 14
    Cited by: Papers (3)

    Recently, the number of electronic devices with smart cameras has grown enormously. These devices require new, fast, and efficient computer vision applications, including moving object detection strategies. In this paper, a novel, high-quality strategy for real-time moving object detection based on nonparametric modeling is presented. It is suitable for smart cameras operating in real time in a large variety of scenarios. The background is modeled using an innovative combination of chromaticity and gradients, reducing the influence of shadows and reflected light on the detections, while the foreground model combines this information with spatial information. A particle filter updates the spatial information and provides a priori knowledge about the areas to analyze in subsequent images, enabling an important reduction in the computational requirements and improving the segmentation results. The quality of the results and the achieved computational efficiency show the suitability of the proposed strategy for enabling new applications and opportunities in the latest generation of electronic devices.

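    A minimal Python/NumPy sketch of the chromaticity-plus-gradient background cue described in this abstract is given below. Function names, thresholds, and the simple per-pixel distance test are illustrative assumptions; the paper's nonparametric (kernel-based) models and the particle filter that restricts the analysis region are not reproduced here.

        import numpy as np

        def chroma_grad_features(frame):
            # Per-pixel (r, g) chromaticity and gradient magnitude; chromaticity
            # is largely invariant to the intensity changes caused by shadows.
            rgb = frame.astype(np.float64) + 1e-6
            chroma = rgb[..., :2] / rgb.sum(axis=2, keepdims=True)
            gy, gx = np.gradient(rgb.mean(axis=2))
            return chroma, np.hypot(gx, gy)

        def foreground_mask(frame, bg_chroma, bg_grad, t_c=0.04, t_g=12.0):
            # Flag pixels whose chromaticity AND gradient both deviate from the
            # background model (thresholds t_c, t_g are illustrative).
            chroma, grad = chroma_grad_features(frame)
            d_c = np.abs(chroma - bg_chroma).sum(axis=2)
            return (d_c > t_c) & (np.abs(grad - bg_grad) > t_g)
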
  • Algorithm and Architecture Design of Human–Machine Interaction in Foreground Object Detection With Dynamic Scene

    Publication Year: 2013, Page(s): 15 - 29
    Cited by: Papers (1)

    In the field of intelligent visual surveillance, tolerating background motion while detecting foreground motion in dynamic scenes is widely explored in the recent foreground detection literature. Applying a sophisticated background modeling method is a common solution to this dynamic background problem. However, sophisticated background modeling is computation intensive and demands large memory bandwidth for data access. Realizing such an approach on a multicamera surveillance system for real-time applications can dramatically increase the hardware cost. This paper presents a hardware-oriented foreground detection scheme based on human-machine interaction at the object level (HMIiOL). The HMIiOL scheme varies the conditions under which a moving object is regarded as a foreground object. The conditions depend on the background environment and are derived from information obtained through human-machine interaction. With the HMIiOL scheme, a simple background modeling method can achieve good foreground detection under significant background motion. A processor based on system-on-chip design is presented for the HMIiOL-based foreground detection. The processor consists of accelerators that increase the throughput of the computationally intensive tasks in the algorithm, and a reduced-instruction-set computing unit that handles the interaction task and the noncomputation-intensive tasks. Pipelining and parallelism techniques are used to increase the throughput. The detection capability of the processor reaches HD720 at 30 Hz, with a maximum throughput of up to 32.707 Mpixels/s. Performance evaluation and comparison with existing foreground detection hardware show the improvement offered by our design.

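    As a rough illustration of the interaction principle, the sketch below pairs a simple running-average background model with a per-pixel threshold map that user feedback relaxes inside object-level regions flagged as tolerable background motion. All names and constants are hypothetical; the actual HMIiOL conditions and the SoC pipeline are considerably more elaborate.

        import numpy as np

        def update_background(bg, frame, alpha=0.02):
            # The kind of simple background model the HMIiOL scheme builds on.
            return (1.0 - alpha) * bg + alpha * frame

        def apply_user_feedback(thresh_map, obj_mask, factor=2.0):
            # Object-level interaction: relax the foreground condition inside a
            # region the operator marks as tolerable background motion.
            out = thresh_map.copy()
            out[obj_mask] *= factor
            return out

        def detect_foreground(frame, bg, thresh_map):
            # Foreground where the deviation exceeds the (locally adapted) threshold.
            return np.abs(frame - bg) > thresh_map
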
  • Fast FPGA-Based Multiobject Feature Extraction

    Publication Year: 2013, Page(s): 30 - 45
    Cited by: Papers (6)

    This paper describes a high-frame-rate (HFR) vision system that can extract the locations and features of multiple objects in an image at 2000 f/s for 512 × 512 images by implementing a cell-based multiobject feature extraction algorithm as hardware logic on a field-programmable gate array (FPGA)-based high-speed vision platform. In the hardware implementation of the algorithm, 25 higher-order local autocorrelation (HLAC) features of 1024 objects in an image can be extracted simultaneously for multiobject recognition by dividing the image into 8 × 8 cells, concurrently with calculation of the zeroth- and first-order moments to obtain the sizes and locations of the multiple objects. The developed HFR multiobject extraction system was verified in several experiments: tracking of multiple objects rotating at 16 r/s, recognition of multiple patterns projected at 1000 f/s, and recognition of human gestures with quick finger motion.

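    The zeroth- and first-order moment computation maps naturally to per-cell reductions, as in the Python/NumPy sketch below, a software stand-in for the FPGA logic. The cell geometry and the binary-mask input are assumptions, and the 25 HLAC feature masks are omitted.

        import numpy as np

        def cell_moments(binary, cell=8):
            # Per-cell zeroth moment m00 (object size) and first moments m10, m01,
            # whose ratios give the object centroid in image coordinates.
            # Assumes the frame size is a multiple of the cell size.
            h, w = binary.shape
            ys, xs = np.mgrid[0:h, 0:w]
            as_cells = lambda a: a.reshape(h // cell, cell, w // cell, cell)
            b = as_cells(binary.astype(np.int64))
            m00 = b.sum(axis=(1, 3))
            m10 = (b * as_cells(xs)).sum(axis=(1, 3))
            m01 = (b * as_cells(ys)).sum(axis=(1, 3))
            n = np.maximum(m00, 1)                   # avoid division by zero
            return m00, m10 / n, m01 / n             # size, centroid x, centroid y
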
  • An ICA Mixture Hidden Conditional Random Field Model for Video Event Classification

    Publication Year: 2013, Page(s): 46 - 59

    In this paper, a hidden conditional random field (HCRF) model with independent component analysis (ICA) mixture feature functions is developed for video event classification. Video content analysis problems can be modeled using graphical models. The hidden Markov model (HMM) is a commonly used graphical model, but it has several limitations, such as the assumption of observation independence, the form of the observation distribution, and the Markov chain interaction. Unlike the HMM, the HCRF is a discriminative model without the conditional independence assumption on observations, and is more suitable for video content analysis. We formulate the video content analysis problem using a new HCRF framework based on the temporal interactions between video frames. In addition, reflecting the non-Gaussian property of video event features, a new feature function using the likelihoods of ICA mixture components is proposed for the local observations to further enhance the HCRF model. The discriminative power of the HCRF and the representation power of the ICA mixture for non-Gaussian distributions are combined in the new model. The new model is applied to the challenging bowling and golf event classifications as case studies. The simulation results support the analysis that the new ICA mixture HCRF (ICAMHCRF) outperforms existing mixture HMM models in terms of classification accuracy.

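    The ICA mixture feature function can be sketched as the vector of per-component log-likelihoods, as below. The Laplacian source prior and all names are assumptions; the paper's training procedure and the HCRF itself are not reproduced.

        import numpy as np

        def ica_mixture_loglikes(x, unmixing_mats, biases):
            # log p(x | k) = log|det W_k| + sum_i log p(s_i), with s = W_k (x - b_k)
            # and a Laplacian prior p(s_i) = 0.5 * exp(-|s_i|) assumed here.
            feats = []
            for W, b in zip(unmixing_mats, biases):
                s = W @ (x - b)
                feats.append(np.log(abs(np.linalg.det(W)))
                             - np.abs(s).sum() - len(s) * np.log(2.0))
            return np.array(feats)   # one observation feature per mixture component
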
  • Improved Census Transforms for Resource-Optimized Stereo Vision

    Publication Year: 2013, Page(s): 60 - 73
    Cited by: Papers (1)

    Real-time stereo vision has proven to be a useful technology with many applications. However, the computationally intensive nature of stereo vision algorithms makes real-time implementation difficult in resource-limited systems. The field-programmable gate array (FPGA) has proven very useful for implementing local stereo methods, yet the resource requirements can still be a significant challenge. This paper proposes a variety of sparse census transforms that dramatically reduce the resource requirements of census-based stereo systems while maintaining stereo correlation accuracy. This paper also proposes and analyzes a new class of census-like transforms, called the generalized census transforms. This new transform allows a variety of very sparse census-like stereo correlation algorithms to be implemented while demonstrating increased robustness and flexibility. The resource savings and performance of these transforms are demonstrated by the design and implementation of a parameterizable stereo system that can implement stereo correlation using any census transform. Several optimizations for typical FPGA-based correlation systems are also proposed. The resulting system is capable of running at over 500 MHz on a modern FPGA, resulting in a throughput of over 500 million input pixel pairs per second.

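    A census transform over a configurable comparison pattern is easy to express in software, as in the Python/NumPy sketch below; a sparse census is simply a shorter offset list. The offsets and names are illustrative, and the paper's generalized transform, which compares arbitrary pixel pairs rather than each neighbor against the center, is not reproduced here.

        import numpy as np

        def census(img, offsets):
            # Per-pixel bit string: 1 where the sampled neighbor is darker
            # than the center pixel.
            h, w = img.shape
            r = max(max(abs(dy), abs(dx)) for dy, dx in offsets)
            pad = np.pad(img, r, mode='edge')
            code = np.zeros((h, w), dtype=np.uint64)
            for dy, dx in offsets:
                nb = pad[r + dy:r + dy + h, r + dx:r + dx + w]
                code = (code << np.uint64(1)) | (nb < img).astype(np.uint64)
            return code

        DENSE_5x5 = [(dy, dx) for dy in range(-2, 3) for dx in range(-2, 3)
                     if (dy, dx) != (0, 0)]
        SPARSE_5x5 = DENSE_5x5[::2]          # half the comparisons, half the bits

        def hamming(a, b):
            # Stereo matching cost: popcount of the XOR of two census codes.
            x, one = a ^ b, np.uint64(1)
            cnt = np.zeros(x.shape, dtype=np.uint8)
            while x.any():
                cnt += (x & one).astype(np.uint8)
                x = x >> one
            return cnt
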
  • Intra-and-Inter-Constraint-Based Video Enhancement Based on Piecewise Tone Mapping

    Publication Year: 2013, Page(s): 74 - 82

    Video enhancement plays an important role in various video applications. In this paper, we propose a new intra-and-inter-constraint-based video enhancement approach that aims to: 1) achieve high intraframe quality across the entire picture, where multiple regions of interest (ROIs) can be adaptively and simultaneously enhanced, and 2) guarantee interframe quality consistency among video frames. We first analyze features from different ROIs and create a piecewise tone mapping curve for the entire frame so that the intraframe quality is enhanced. We further introduce new interframe constraints to improve the temporal quality consistency. Experimental results show that the proposed algorithm clearly outperforms state-of-the-art algorithms.

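    A piecewise tone mapping curve and a simple interframe constraint can be sketched as follows; the knot values and the smoothing factor are illustrative, and the paper's ROI-driven curve construction is not reproduced.

        import numpy as np

        def apply_tone_curve(frame, knots_in, knots_out):
            # Piecewise-linear tone mapping applied to every pixel.
            return np.interp(frame.astype(np.float64),
                             knots_in, knots_out).astype(frame.dtype)

        def constrain_curve(prev_out, cur_out, lam=0.7):
            # One possible interframe constraint: low-pass filter the curve over
            # time so consecutive frames receive consistent enhancement.
            return [lam * p + (1.0 - lam) * c for p, c in zip(prev_out, cur_out)]

        # Example: lift mid-tones while pinning the black and white points.
        # enhanced = apply_tone_curve(frame, [0, 64, 128, 192, 255],
        #                                    [0, 80, 150, 205, 255])
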
  • Improved Foreground Detection via Block-Based Classifier Cascade With Probabilistic Decision Integration

    Publication Year: 2013, Page(s): 83 - 93
    Cited by: Papers (8)

    Background subtraction is a fundamental low-level processing task in numerous computer vision applications. The vast majority of algorithms process images on a pixel-by-pixel basis, where an independent decision is made for each pixel. A general limitation of such processing is that rich contextual information is not taken into account. We propose a block-based method capable of dealing with noise, illumination variations, and dynamic backgrounds, while still obtaining smooth contours of foreground objects. Specifically, image sequences are analyzed on an overlapping block-by-block basis. A low-dimensional texture descriptor obtained from each block is passed through an adaptive classifier cascade, where each stage handles a distinct problem. A probabilistic foreground mask generation approach then exploits block overlaps to integrate interim block-level decisions into final pixel-level foreground segmentation. Unlike many pixel-based methods, ad-hoc postprocessing of foreground masks is not required. Experiments on the difficult Wallflower and I2R datasets show that the proposed approach obtains on average better results (both qualitatively and quantitatively) than several prominent methods. We furthermore propose the use of tracking performance as an unbiased approach for assessing the practical usefulness of foreground segmentation methods, and show that the proposed approach leads to considerable improvements in tracking accuracy on the CAVIAR dataset.

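    The probabilistic decision-integration step lends itself to a compact sketch: overlapping block-level foreground probabilities are averaged per pixel and thresholded. The block size, step, and 0.5 threshold are assumptions; the classifier cascade that produces the block probabilities is not reproduced.

        import numpy as np

        def integrate_blocks(block_probs, img_shape, block=16, step=8):
            # block_probs[i, j] is the cascade's foreground probability for the
            # block whose top-left corner is (i * step, j * step).
            acc = np.zeros(img_shape)
            cnt = np.zeros(img_shape)
            for i in range(block_probs.shape[0]):
                for j in range(block_probs.shape[1]):
                    y, x = i * step, j * step
                    acc[y:y + block, x:x + block] += block_probs[i, j]
                    cnt[y:y + block, x:x + block] += 1
            # A pixel is foreground when the overlapping blocks agree on average.
            return acc / np.maximum(cnt, 1) > 0.5
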
  • Real-Time Stereo Matching on CUDA Using an Iterative Refinement Method for Adaptive Support-Weight Correspondences

    Publication Year: 2013, Page(s): 94 - 104
    Cited by: Papers (4)

    High-quality real-time stereo matching has the potential to enable various computer vision applications including semi-automated robotic surgery, teleimmersion, and 3-D video surveillance. A novel real-time stereo matching method is presented that uses a two-pass approximation of adaptive support-weight aggregation, and a low-complexity iterative disparity refinement technique. Through an evaluation of computationally efficient approaches to adaptive support-weight cost aggregation, it is shown that the two-pass method produces an accurate approximation of the support weights while greatly reducing the complexity of aggregation. The refinement technique, constructed using a probabilistic framework, incorporates an additive term into matching cost minimization and facilitates iterative processing to improve the accuracy of the disparity map. This method has been implemented on massively parallel high-performance graphics hardware using the Compute Unified Device Architecture computing engine. Results show that the proposed method is the most accurate among all of the real-time stereo matching methods listed on the Middlebury stereo benchmark.

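    The two-pass approximation can be sketched as a vertical pass followed by a horizontal pass of 1-D weighted aggregation, as below for a single disparity's cost plane. The weight form follows the usual adaptive support-weight formulation; the constants are illustrative, edges wrap for brevity, and the reference-target weight product and the iterative refinement are omitted.

        import numpy as np

        def two_pass_asw(cost, gray, radius=5, gamma_c=10.0, gamma_s=17.5):
            # Separable approximation of adaptive support-weight aggregation:
            # w = exp(-|color difference| / gamma_c - spatial distance / gamma_s).
            # cost and gray are float arrays of the same shape.
            def one_pass(c, axis):
                num = np.zeros_like(c)
                den = np.zeros_like(c)
                for d in range(-radius, radius + 1):
                    w = np.exp(-np.abs(gray - np.roll(gray, d, axis)) / gamma_c
                               - abs(d) / gamma_s)
                    num += w * np.roll(c, d, axis)
                    den += w
                return num / den
            return one_pass(one_pass(cost, axis=0), axis=1)
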
  • A Bayesian Approach on People Localization in Multicamera Systems

    Publication Year: 2013, Page(s): 105 - 115

    In this paper, we introduce a Bayesian approach to multiple people localization in multicamera systems. First, pixel-level features are extracted that are based on the physical properties of the 2-D image formation process and provide information about the head and leg positions of the pedestrians, for standing and walking people, respectively. Then, features from the multiple camera views are fused to create evidence for the location and height of people in the ground plane. This evidence accurately estimates the leg position even if the area of interest is only a part of the scene, or if the overlap ratio with the monitored area of silhouettes from irrelevant outside motion is significant. Using this information, we create a 3-D object configuration model in the real world. We also utilize a prior geometrical constraint, which describes the possible interactions between two pedestrians. To approximate the positions of the people, we use a population of 3-D cylinder objects, realized by a marked point process. The final configuration results are obtained by an iterative stochastic energy optimization algorithm. The proposed approach is evaluated on two publicly available datasets and compared to a recent state-of-the-art technique. To obtain relevant quantitative test results, a 3-D ground truth annotation of the real pedestrian locations is prepared, while two different error metrics and various parameter settings are proposed and evaluated, showing the advantages of the proposed model.

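    The multicamera evidence fusion can be sketched by projecting each camera's silhouette mask onto a ground-plane grid through a calibration homography and summing, as below. The homography convention and the grid are assumptions; the marked point process and the stochastic optimization that produce the final configuration are not reproduced.

        import numpy as np

        def ground_evidence(mask, H, grid_shape):
            # H maps homogeneous ground-grid coordinates (u, v, 1) to image
            # pixels (assumed nonzero scale for cells in front of the camera);
            # cells landing inside the silhouette collect leg-position evidence.
            gh, gw = grid_shape
            us, vs = np.meshgrid(np.arange(gw), np.arange(gh))
            pts = H @ np.stack([us.ravel(), vs.ravel(), np.ones(gh * gw)])
            x = (pts[0] / pts[2]).round().astype(int)
            y = (pts[1] / pts[2]).round().astype(int)
            ok = (x >= 0) & (x < mask.shape[1]) & (y >= 0) & (y < mask.shape[0])
            ev = np.zeros(gh * gw)
            ev[ok] = mask[y[ok], x[ok]]
            return ev.reshape(gh, gw)

        # fused = sum(ground_evidence(m, H, grid) for m, H in zip(masks, homogs))
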
  • Motion-Compensated Scalable Video Transmission Over MIMO Wireless Channels

    Publication Year: 2013, Page(s): 116 - 127

    We study motion-compensated fine granular scalable (MC-FGS) video transmission over multiple-input multiple-output (MIMO) wireless channels, applicable to video streaming, where leaky and partial prediction schemes are applied in the enhancement layer of MC-FGS to exploit the tradeoff between error propagation and coding efficiency. For reliable transmission, we propose unequal error protection (UEP) that considers the tradeoff between reliability and data rate, controlled through forward error correction and MIMO mode selection to minimize the average distortion. In a high-Doppler environment, where it is hard to obtain an accurate channel estimate, we investigate the performance of the proposed MC-FGS video transmission scheme with joint control of the leaky and partial prediction parameters and the UEP. In a slow-fading channel, where the channel throughput can be estimated at the transmitter, adaptive control of the prediction parameters is considered.

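    The leaky and partial prediction of the enhancement layer admits a short sketch, given below with illustrative names: alpha near 1 favors coding efficiency, alpha near 0 limits drift when enhancement data is lost, and the partial mask selects where enhancement-layer prediction is used at all.

        import numpy as np

        def leaky_partial_prediction(base_ref, enh_ref, alpha, partial_mask):
            # Blend the (drift-prone) enhancement reference with the (robust)
            # base-layer reference only where the partial mask allows it.
            pred = base_ref.astype(np.float64).copy()
            pred[partial_mask] = ((1.0 - alpha) * base_ref[partial_mask]
                                  + alpha * enh_ref[partial_mask])
            return pred
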
  • Color Video Denoising Based on Combined Interframe and Intercolor Prediction

    Publication Year: 2013, Page(s): 128 - 141
    Cited by: Papers (1)

    An advanced color video denoising scheme based on combined interframe and intercolor prediction, which we call CIFIC, is proposed in this paper. CIFIC performs denoising filtering in the RGB color space and directly exploits both the interframe and intercolor correlation in the color video signal by forming multiple predictors for each color component from all three color components in the current frame as well as in the motion-compensated neighboring reference frames. The temporal correspondence is established through joint-RGB motion estimation (ME), which acquires a single motion trajectory for the red, green, and blue components. The current noisy observation and the interframe and intercolor predictors are then combined by a linear minimum mean squared error (LMMSE) filter to obtain the denoised estimate for every color component. Ill-conditioning in the weight determination of the LMMSE filter is detected and remedied by gradually removing the “least contributing” predictor. Furthermore, our previous work on the LMMSE filter applied in the adaptive luminance-chrominance space (LAYUV for short) is revisited. By reformulating LAYUV and comparing it with CIFIC, we deduce that LAYUV is a restricted version of CIFIC, and thus CIFIC can theoretically achieve a lower denoising error. Experimental results verify the improvement brought by the joint-RGB ME and the integration of intercolor prediction, as well as the superiority of CIFIC over LAYUV. Meanwhile, when compared with other state-of-the-art algorithms, CIFIC provides competitive performance both in terms of color peak signal-to-noise ratio and in perceptual quality.

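    The LMMSE combination step can be sketched as solving the normal equations for the predictor weights, as below. Using the noisy observation as the correlation target is a stand-in that is reasonable only when the noise is zero-mean and independent of the predictors, and the ridge term replaces the paper's remedy of dropping the least contributing predictor; all names are illustrative.

        import numpy as np

        def lmmse_combine(preds, noisy, ridge=1e-8):
            # preds: (k, n) stacked predictor signals; noisy: (n,) observation.
            n = preds.shape[1]
            R = preds @ preds.T / n              # predictor autocorrelation
            r = preds @ noisy / n                # cross-correlation with target
            w = np.linalg.solve(R + ridge * np.eye(len(R)), r)
            return w @ preds                     # denoised estimate
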
  • A Deformable 3-D Facial Expression Model for Dynamic Human Emotional State Recognition

    Publication Year: 2013, Page(s): 142 - 157
    Cited by: Papers (1)

    Automatic emotion recognition from facial expressions is one of the most intensively researched topics in affective computing and human-computer interaction. However, it is well known that, for lack of 3-D features and dynamic analysis, the functional aspect of affective computing is insufficient for natural interaction. In this paper, we present an automatic emotion recognition approach for video sequences based on a fiducial-point-controlled 3-D facial model. The facial region is first detected, with local normalization, in the input frames. Twenty-six fiducial points are then located on the facial region and tracked through the video sequences by multiple particle filters. Based on their displacements, the fiducial points are used as landmark control points to synthesize the input emotional expressions on a generic mesh model. As a physics-based transformation, elastic body spline technology is introduced on the facial mesh to generate a smooth warp that reflects the control-point correspondences; this also extracts the deformation feature from the realistic emotional expressions. Discriminative Isomap-based classification is used to embed the deformation feature into a low-dimensional manifold that spans an expression space with one neutral and six emotion class centers. The final decision is made by computing the nearest class center in the feature space.

  • Fusion of Global and Local Motion Estimation for Distributed Video Coding

    Publication Year: 2013, Page(s): 158 - 172
    Cited by: Papers (2)

    The quality of the side information plays a key role in distributed video coding. In this paper, we propose a new approach that combines global and local motion compensation at the decoder side. The parameters of the global motion are estimated at the encoder using scale-invariant feature transform (SIFT) features. The estimated parameters are sent to the decoder in order to generate globally motion-compensated side information. In parallel, locally motion-compensated side information is generated at the decoder by motion-compensated temporal interpolation of neighboring reference frames. Moreover, an improved fusion of the global and local side information during the decoding process is achieved using the partially decoded Wyner-Ziv frame and the decoded reference frames. The proposed technique significantly improves the quality of the side information, especially for sequences containing high global motion. Experimental results show that, as far as rate-distortion performance is concerned, the proposed approach achieves a PSNR improvement of up to 1.9 dB for a group of pictures (GOP) size of 2, and up to 4.65 dB for larger GOP sizes, with respect to the reference DISCOVER codec.

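    A much-simplified stand-in for the fusion rule is sketched below: per block, keep whichever side information better matches the partially decoded Wyner-Ziv frame. The block size and mean-absolute-error criterion are assumptions; generating the global side information (warping a reference with the encoder-sent global-motion parameters) and the local side information (motion-compensated temporal interpolation) is not shown.

        import numpy as np

        def fuse_side_information(si_global, si_local, wz_partial, block=16):
            # Block-wise selection against the partially decoded WZ frame.
            fused = si_local.copy()
            h, w = wz_partial.shape
            for y in range(0, h, block):
                for x in range(0, w, block):
                    s = (slice(y, y + block), slice(x, x + block))
                    if (np.abs(si_global[s] - wz_partial[s]).mean()
                            < np.abs(si_local[s] - wz_partial[s]).mean()):
                        fused[s] = si_global[s]
            return fused
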
  • Mixed Chroma Sampling-Rate High Efficiency Video Coding for Full-Chroma Screen Content

    Publication Year: 2013, Page(s): 173 - 185
    Cited by: Papers (7)

    Computer screens contain both discontinuous-tone and continuous-tone content. Thus, the most effective way to perform screen content coding (SCC) is to use two essentially different coders: a dictionary-entropy coder and a traditional hybrid coder. Although screen content is originally in a full-chroma (e.g., YUV444) format, the current compression method is to first subsample the chroma of pictures and then compress them using a chroma-subsampled (e.g., YUV420) coder. Using two chroma-subsampled coders cannot achieve high-quality SCC, but using two full-chroma coders is overkill and inefficient for SCC. To solve the dilemma, this paper proposes a mixed chroma sampling-rate approach for SCC. An original full-chroma input macroblock (coding unit), or its prediction residual, is chroma-subsampled. One full-chroma base coder and one chroma-subsampled base coder are used simultaneously to code the original and the chroma-subsampled macroblock, respectively. The coder minimizing the rate-distortion (R-D) cost is selected as the final coder for the macroblock. The two base coders are coherently unified and optimized to obtain the best overall coding performance, sharing coding components and resources as much as possible. The approach achieves very high visual quality with a minimal increase in computational complexity for SCC, and has better R-D performance than the two-full-chroma-coder approach, especially at low bitrates.

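    The per-macroblock coder selection is a standard rate-distortion mode decision, sketched below; the lambda value and candidate names are illustrative.

        def pick_coder(candidates, lam):
            # candidates: (name, distortion, rate) triples from trial-encoding
            # the macroblock with each base coder; minimize J = D + lambda * R.
            return min(candidates, key=lambda c: c[1] + lam * c[2])

        # e.g. pick_coder([("full_chroma", d444, r444),
        #                  ("chroma_subsampled", d420, r420)], lam=0.85)
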
  • Open Access [advertisement]

    Publication Year: 2013, Page(s): 186
  • IEEE Xplore Digital Library

    Publication Year: 2013, Page(s): 187
  • IEEE Foundation

    Publication Year: 2013, Page(s): 188
  • IEEE Circuits and Systems Society Information

    Publication Year: 2013, Page(s): C3
  • IEEE Transactions on Circuits and Systems for Video Technology information for authors

    Publication Year: 2013, Page(s): C4

Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems: Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it