IEEE Transactions on Circuits and Systems for Video Technology

Issue 2 • Date Feb. 2008

Displaying Results 1 - 18 of 18
  • Table of contents

    Page(s): C1
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): C2
  • In-Scale Motion Compensation for Spatially Scalable Video Coding

    Page(s): 145 - 158

    In existing pyramid-based spatially scalable coding schemes, such as H.264/MPEG-4 SVC (scalable video coding), the video frame at a given high-resolution layer is predicted either from the same frame at the next lower resolution layer or from temporally neighboring frames within the same resolution layer. However, these schemes fail to exploit both kinds of correlation simultaneously and therefore cannot efficiently remove the redundancy among resolution layers. This paper extends the idea of the spatiotemporal subband transform and proposes a general in-scale motion compensation technique for pyramid-based spatially scalable video coding. The video frame at each high-resolution layer is partitioned into two frequency bands. The prediction for the lowpass band is derived from the next lower resolution layer, whereas the prediction for the highpass band is obtained from neighboring frames within the same resolution layer to further exploit temporal correlation. In this way, both kinds of correlation are exploited simultaneously and the cross-resolution-layer redundancy can be largely removed. Furthermore, this paper proposes a macroblock-based adaptive in-scale technique for hybrid spatial and SNR scalability. Experimental results show that the proposed techniques significantly improve the spatial scalability performance of H.264/MPEG-4 SVC, especially when the bit-rate ratio of the lower resolution bit stream to the higher resolution bit stream is considerable.
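The split-prediction idea in the abstract can be sketched in a few lines. The Haar split, the 1-D toy "frames", the motion-free temporal prediction, and the idealization of the base layer as exactly the lowpass band are all illustrative assumptions, not the paper's actual scheme:

```python
import math

def haar_split(x):
    """Split a 1-D signal into lowpass and highpass halves (orthonormal Haar)."""
    low = [(a + b) / math.sqrt(2.0) for a, b in zip(x[0::2], x[1::2])]
    high = [(a - b) / math.sqrt(2.0) for a, b in zip(x[0::2], x[1::2])]
    return low, high

# Toy 1-D "frames": the current high-resolution frame and its temporal neighbour.
frame = [float(i) for i in range(8)]
prev = [f + 0.5 for f in frame]          # illumination offset, no motion

low, high = haar_split(frame)
_, prev_high = haar_split(prev)

# In-scale prediction: the lowpass part comes from the lower-resolution layer
# (idealised here as exactly the lowpass band of the current frame), while the
# highpass part is predicted temporally from the neighbouring frame.
pred_low, pred_high = low, prev_high
residual = [a - b for a, b in zip(low + high, pred_low + pred_high)]
```

In this idealized case both predictions are exact, so the residual vanishes; the point is only that the two bands get their predictions from different sources.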

  • Rate-Distortion and Complexity Optimized Motion Estimation for H.264 Video Coding

    Page(s): 159 - 171

    The H.264 video coding standard supports several inter-prediction coding modes that use macroblock (MB) partitions with variable block sizes. Rate-distortion (R-D) optimal selection of both the motion vectors (MVs) and the coding mode of each MB is essential for an H.264 encoder to achieve superior coding efficiency. Unfortunately, searching for the optimal MVs of each possible subblock incurs a heavy computational cost. In this paper, in order to reduce the computational burden of integer-pel motion estimation (ME) without sacrificing coding performance, we propose a joint R-D and complexity optimization framework. Within this framework, we develop a simple method that determines, for each MB, which partitions are likely to be optimal. The MV search is carried out only for the selected partitions, thus reducing the complexity of the ME step. The mode selection criterion is based on a measure of spatiotemporal activity within the MB. The procedure minimizes the coding loss at a given level of computational complexity, either for the full video sequence or for each single frame. In the latter case, the algorithm provides a tight upper bound on the worst-case complexity/execution time of the ME module. Simulation results show that the algorithm speeds up integer-pel ME by a factor of up to 40 with less than 0.2 dB loss in coding efficiency.
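The R-D optimal mode selection the abstract refers to amounts to minimizing a Lagrangian cost J = D + λR over the candidate partitions. A minimal sketch; the (distortion, rate) numbers are invented purely for illustration:

```python
# Hypothetical (distortion, rate-in-bits) pairs after motion search for each
# candidate MB partition; invented numbers, purely for illustration.
candidates = {
    "16x16": (1200.0, 48),   # coarse partition: higher distortion, fewer bits
    "16x8":  (1100.0, 70),
    "8x8":   (1000.0, 115),  # fine partition: lower distortion, more bits
}

def best_mode(costs, lam):
    """Return the partition minimising the Lagrangian cost J = D + lam * R."""
    return min(costs, key=lambda m: costs[m][0] + lam * costs[m][1])

fine = best_mode(candidates, 1.0)     # small lambda: distortion dominates
coarse = best_mode(candidates, 20.0)  # large lambda: rate dominates
```

The paper's contribution is to skip the motion search for partitions that this minimization is unlikely to pick, so only a subset of the dictionary above would ever be populated.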

  • A Fast MB Mode Decision Algorithm for MPEG-2 to H.264 P-Frame Transcoding

    Page(s): 172 - 185

    The H.264 standard achieves much higher coding efficiency than the MPEG-2 standard due to its improved inter- and intra-prediction modes, at the expense of higher computational complexity. Transcoding MPEG-2 video to H.264 is important to enable a gradual migration to H.264. However, given the significant differences between the MPEG-2 and H.264 coding algorithms, transcoding is a much more complex task, and new approaches are necessary. The main problems that must be addressed in the design of an efficient heterogeneous MPEG-2/H.264 transcoder are inter-frame prediction, transform coding, and intra-frame prediction. In this paper, we focus on inter-frame prediction, the most computationally intensive task in the transcoding process. This paper presents a novel macroblock (MB) mode decision algorithm for P-frame prediction based on machine learning techniques, to be used as part of a very low complexity MPEG-2 to H.264 video transcoder. Since coding mode decisions consume the most resources in video transcoding, fast MB mode estimation leads to reduced complexity. The proposed approach is based on the hypothesis that MB coding mode decisions in H.264 video are correlated with the distribution of the motion-compensated residual in MPEG-2 video. We use machine learning tools to exploit this correlation and construct decision trees that classify incoming MPEG-2 MBs into one of the several coding modes in H.264. The proposed approach reduces the H.264 MB mode computation to a decision tree lookup of very low complexity. Experimental results show that the proposed approach reduces the MB mode selection complexity by as much as 95% while maintaining coding efficiency. Finally, we conduct a comparative study with some of the most prominent fast inter-prediction methods for H.264 in the literature. Our results show that the proposed approach achieves the best results for video transcoding applications.
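The decision-tree lookup the abstract describes can be sketched as below. The tree shape, the thresholds, and the residual statistics used as features are all invented stand-ins; the paper learns the actual tree from training data:

```python
def predict_h264_mode(residual_mean, residual_var):
    """Stand-in for a learned decision tree: map simple statistics of the
    MPEG-2 motion-compensated residual of an MB to an H.264 coding mode.
    Thresholds and tree shape are invented for illustration only."""
    if residual_var < 4.0:
        # Near-flat residual: the block is well predicted already.
        return "SKIP" if residual_mean < 0.5 else "Inter16x16"
    if residual_var < 64.0:
        return "Inter16x8"
    return "Inter8x8"

modes = [predict_h264_mode(0.1, 1.0),     # flat residual
         predict_h264_mode(3.0, 30.0),    # moderate residual
         predict_h264_mode(5.0, 200.0)]   # busy residual
```

The payoff is that the whole R-D mode search collapses to a few comparisons per MB.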

  • Fast Inter-Mode Selection in the H.264/AVC Standard Using a Hierarchical Decision Process

    Page(s): 186 - 195

    A complexity reduction algorithm tailored to the H.264/AVC encoder is described. It aims to alleviate the computational burden imposed by Lagrangian rate-distortion optimization in the inter-mode selection process. The proposed algorithm has a hierarchical structure comprising three levels. Each level targets different types of macroblocks according to the complexity of the search process. Early termination of mode selection can be triggered at any of the levels to avoid a full cycle of Lagrangian examination. The algorithm is evaluated using a wide range of test sequences of different classes. The results demonstrate a reduction in encoding time of at least 40%, regardless of the class of sequence. Despite the reduction in computational complexity, picture quality is maintained at all bit rates.

  • DEWS: A Live Visual Surveillance System for Early Drowning Detection at Pool

    Page(s): 196 - 210

    A real-time vision system operating at an outdoor swimming pool is presented in this paper. The system is designed to automatically recognize different swimming activities and to detect early drowning incidents. We have named this system the Drowning Early Warning System (DEWS). One key challenge is the relatively high level of noise in the foreground detection and behavior recognition steps. Therefore, a set of methods in background subtraction, denoising, data fusion, and blob splitting is proposed, motivated by the characteristics of the aquatic background and the crowded scene at the pool. To detect an early drowning incident, visual indicators of distress and drowning are incorporated through a set of foreground descriptors. A module comprising data fusion and hidden Markov modeling is developed to learn the unique traits of different swimming behaviors, in particular early drowning events. This work reports realistic on-site evaluations. Examples of behaviors of interest, i.e., distress, drowning, treading, and numerous swimming styles, were simulated and collected. Experimental results show that we have established a prototype system that is robust and beyond the proof-of-concept stage.
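The hidden Markov modeling step can be illustrated with the classic forward algorithm: score an observation sequence under per-behavior HMMs and pick the most likely behavior. The two-state models, the discretized observations, and every probability below are invented for illustration; they are not DEWS parameters:

```python
import math

def _logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM
    (the classic forward algorithm, computed in the log domain)."""
    n = len(start)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [math.log(emit[s][o]) +
                 _logsumexp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
                 for s in range(n)]
    return _logsumexp(alpha)

# Two-state toy models; observation 0 = smooth motion, 1 = erratic motion.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
swim_emit = [[0.9, 0.1], [0.8, 0.2]]       # mostly smooth motion
distress_emit = [[0.2, 0.8], [0.1, 0.9]]   # mostly erratic motion

obs = [1, 1, 1, 1]                          # a run of erratic observations
label = ("distress"
         if forward_log_likelihood(obs, start, trans, distress_emit)
         > forward_log_likelihood(obs, start, trans, swim_emit)
         else "swim")
```

A deployed system would feed the HMMs with the fused foreground descriptors rather than a single binary feature.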

  • Group Behavior Recognition for Gesture Analysis

    Page(s): 211 - 222

    This paper analyzes the movements of the human body limbs (hands, feet, and head) and center of gravity in order to detect and analyze simple actions such as walking and running. We propose a novel view of the human body, considering the limbs as cooperative agents that form a hierarchy of cooperative teams: the whole body. The movements are analyzed at the individual level and at the team level using a modular hierarchical structure. Knowledge of high-level team actions (such as "walking") improves the pertinence of our predictions on low-level individual actions (a foot moving back and forth) and allows us to compensate for missing or noisy data produced by the feature extraction system. In terms of group behavior recognition, we propose a novel framework for online probabilistic plan recognition in cooperative multiagent systems: the Multiagent Hidden Markov mEmory Model (M-AHMEM), which is a dynamic Bayesian network. Experiments on an existing video database using different models of the human body show the feasibility of the approach.

  • Robust and Accurate Object Tracking Under Various Types of Occlusions

    Page(s): 223 - 236

    We propose a complete solution for robust and accurate object tracking in the face of various types of occlusions, which pose many challenges to correctly judging the occlusion situation and properly updating the target template. To tackle these challenges, we first propose a content-adaptive progressive occlusion analysis (CAPOA) algorithm. By combining the information provided by the spatiotemporal context, the reference target, and motion constraints, the algorithm makes a clear distinction between the target and outliers. Accurate tracking of an occluded target is achieved by rectifying the target location using variant-mask template matching (VMTM). To deal with template drift in the template update process, we propose a drift-inhibitive masked Kalman appearance filter (DIMKAF), which accurately evaluates the influence of template drift when updating the masked template. Finally, we devise a local best match authentication (LBMA) algorithm to handle complete occlusions, achieving a much more trustworthy detection of the end of an arbitrarily long complete occlusion. Experimental results show that our proposed solution tracks targets reliably and accurately under short-term, long-term, partial, and complete occlusions.

  • On the Design of Fast Wavelet Transform Algorithms With Low Memory Requirements

    Page(s): 237 - 248

    In this paper, a new algorithm to efficiently compute the two-dimensional wavelet transform is presented. The algorithm aims at low memory consumption and reduced complexity, meeting these requirements by means of line-by-line processing. In this proposal, we use recursion to automatically establish the order in which the wavelet transform is computed. This way, we solve some synchronization problems that have not been tackled by previous proposals. Furthermore, unlike other similar proposals, ours can be implemented straightforwardly from the algorithm description. To this end, a general algorithm is given, which is further detailed to allow its implementation with a simple filter bank or with the more efficient lifting scheme. We also include a new fast run-length encoder to be used along with the proposed wavelet transform for fast image compression with reduced memory consumption. When a 5-megapixel image is transformed, experimental results show that the proposed wavelet transform requires 200 times less memory and is five times faster than the regular one. Considering the whole coding system, numerical results show that it achieves state-of-the-art performance with very low memory requirements and fast execution, making it an interesting solution for resource-constrained devices such as mobile phones, digital cameras, and PDAs.
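The lifting scheme the abstract mentions is what makes line-based, low-memory transforms practical: each step updates a small buffer in place. A minimal 1-D sketch using the LeGall 5/3 wavelet, assuming an even-length input and simple edge replication (standard codecs use symmetric extension instead):

```python
def lift_53_forward(x):
    """One level of the LeGall 5/3 wavelet computed by lifting.
    Assumes len(x) is even; uses simple edge replication at the borders."""
    s = [float(v) for v in x[0::2]]   # even samples -> lowpass after update
    d = [float(v) for v in x[1::2]]   # odd samples  -> highpass after predict
    # Predict step: subtract the average of the two neighbouring even samples.
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[-1]
        d[i] -= (s[i] + right) / 2.0
    # Update step: add a quarter of the two neighbouring detail samples.
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[0]
        s[i] += (left + d[i]) / 4.0
    return s, d

low, high = lift_53_forward(list(range(8)))
```

On a linear ramp the predict step cancels the signal exactly, so the interior highpass coefficients are zero; only the replicated border leaves a residue.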

  • Dual Frame Motion Compensation With Uneven Quality Assignment

    Page(s): 249 - 256

    Video codecs that use motion compensation have shown PSNR gains from multiple frame prediction, in which more than one past reference frame is available for motion estimation. In dual frame motion compensation, one short-term reference frame and one long-term reference frame are available for prediction. In this paper, we explore dual frame motion compensation in two contexts. We first show that using a single fixed long-term reference frame can enhance video quality in a rate-switching network. Next, by periodically creating high-quality long-term reference frames, we show that performance is superior to a standard dual frame technique that has the same average rate but no high-quality frames.
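The core selection step in dual frame motion compensation can be sketched as choosing, per block, whichever reference predicts it better. The pixel values below are invented; a real encoder would also run a motion search per reference and add a rate term:

```python
def sad(a, b):
    """Sum of absolute differences between two pixel blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))

def choose_reference(block, short_term, long_term):
    """Pick whichever reference frame predicts the block with smaller SAD;
    shows only the selection idea, not a full motion search."""
    s, l = sad(block, short_term), sad(block, long_term)
    return ("short-term", s) if s <= l else ("long-term", l)

# The long-term frame was coded at high quality, so it matches better here.
current   = [10, 20, 30, 40]
short_ref = [14, 24, 34, 44]   # noisier, lower-quality reconstruction
long_ref  = [10, 21, 30, 41]   # high-quality long-term reference
ref, err = choose_reference(current, short_ref, long_ref)
```

This is why periodically spending extra bits on a long-term frame can pay off: many later blocks predict from it cheaply.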

  • An Accurate Low-Complexity Rate Control Algorithm Based on the (ρ, E_q)-Domain

    Page(s): 257 - 262

    The H.264/AVC standard defines an efficient coding architecture both for applications where bandwidth or storage capacity is limited (e.g., video telephony or video conferencing over mobile channels and devices) and for applications that require high reconstruction quality and bit rate (e.g., HDTV). Since its main applications concern video communication over time-varying-bandwidth channels, the bit rate has to be controlled with scalable algorithms that can be implemented on low-resource devices. This paper describes a rate control algorithm that needs less memory and lower complexity than existing ones. The number of coded bits for each frame is accurately predicted from the percentage of null quantized transform coefficients, which is related to the quantization step via the energy of the quantized signal. A rate control algorithm based on this model provides good compression performance at low computational cost.
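The ρ-domain idea behind this family of rate controllers is that the bit count is nearly linear in the fraction of non-zero quantized coefficients, R(ρ) = θ(1 − ρ). A sketch under that assumption; the coefficients, the model slope θ, and the dead-zone zero test are all illustrative:

```python
def rho(coeffs, qstep):
    """Fraction of transform coefficients quantised to zero at step qstep
    (simple dead-zone approximation)."""
    return sum(1 for c in coeffs if abs(c) < qstep) / len(coeffs)

def predict_bits(theta, r):
    """Linear rho-domain model: R(rho) = theta * (1 - rho), i.e. bits are
    proportional to the share of non-zero coefficients."""
    return theta * (1.0 - r)

def pick_qstep(coeffs, theta, target_bits, qsteps):
    """Smallest quantisation step whose predicted rate fits the budget."""
    for q in sorted(qsteps):
        if predict_bits(theta, rho(coeffs, q)) <= target_bits:
            return q
    return max(qsteps)

# Invented coefficients and slope, purely for illustration.
coeffs = [0.2, 0.5, 1.0, 2.0, 4.0, 8.0]
q = pick_qstep(coeffs, theta=100.0, target_bits=60.0, qsteps=[0.5, 1, 2, 4])
```

Only a zero count per quantization step is needed, which is what keeps memory and complexity low.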

  • Fast Optimal Motion Estimation Based on Gradient-Based Adaptive Multilevel Successive Elimination

    Page(s): 263 - 267

    In this paper, we propose a fast and optimal solution for block motion estimation based on an adaptive multilevel successive elimination algorithm. The algorithm applies a modified multilevel successive elimination algorithm (SEA) in which the elimination order is determined by the sum of the gradient magnitudes of each subblock, and the elimination process is terminated by comparing this sum with a threshold. In addition, a fast approximate motion estimation method and an accumulated distortion scheme are employed to make the proposed algorithm even more efficient. Experimental results show that the proposed adaptive multilevel successive elimination algorithm (AdaMSEA) significantly outperforms previous optimal motion estimation algorithms, including SEA, MSEA, and FGSE, on a wide variety of video sequences. Finally, we modify AdaMSEA into an approximate motion estimation algorithm to achieve very fast computation, and the experimental results show that this approximate algorithm outperforms several fast motion estimation algorithms.
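Successive elimination rests on one inequality: |sum(B) − sum(C)| ≤ SAD(B, C), so a candidate whose bound already reaches the best SAD found so far cannot be the optimum and its full SAD never needs computing. A single-level sketch (the paper's multilevel, gradient-ordered variant refines this per subblock); the toy blocks are invented:

```python
def sad(a, b):
    """Sum of absolute differences between two pixel blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))

def sea_search(block, candidates):
    """Exhaustive search sped up by the successive elimination bound
    |sum(B) - sum(C)| <= SAD(B, C); the result is still exactly optimal."""
    bsum = sum(block)
    best_i, best = 0, sad(block, candidates[0])
    for i, c in enumerate(candidates[1:], start=1):
        if abs(bsum - sum(c)) >= best:   # bound proves c cannot win: skip SAD
            continue
        d = sad(block, c)
        if d < best:
            best_i, best = i, d
    return best_i, best

block = [1, 2, 3, 4]
cands = [[1, 2, 3, 5], [9, 9, 9, 9], [1, 2, 3, 4]]
result = sea_search(block, cands)
```

In this toy run the second candidate is eliminated by the bound alone, and the exact best match (the third candidate, SAD 0) is still found.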

  • An H.264/AVC Video Coder Based on a Multiple Description Scalar Quantizer

    Page(s): 268 - 272

    Transmission of encoded video sequences over unreliable networks usually requires protection techniques to guarantee good reconstruction quality at the receiver side. Multiple description coding (MDC) strategies add reliability to real-time video applications, where retransmission is not possible and packet losses afflict several frames, degrading the overall quality. This paper presents a novel MDC scheme based on the H.264/AVC standard and a multiple description scalar quantizer that encodes the residual information into two descriptions. Objective and visual performance results are presented.

  • Novel Inter-Mode Decision Algorithm Based on Macroblock (MB) Tracking for the P-Slice in H.264/AVC Video Coding

    Page(s): 273 - 279

    We propose a fast macroblock (MB) mode prediction and decision algorithm based on temporal correlation for P-slices in the H.264/AVC video standard. There are nine 4×4 and 8×8 modes and four 16×16 modes in intra-mode prediction, and eight block types, including the SKIP mode, are examined for the best coding gain based on rate-distortion (R-D) optimization. This scheme gives rise to an exhaustive search in the coding procedure. To overcome this problem, a thresholding method for fast inter-mode decision, using an MB tracking scheme to find the most correlated block and the R-D cost of that block, is suggested for early inter-mode determination. An inter-mode candidate selection method is first derived through statistical analysis. Then, an adaptive inter-mode search algorithm is applied using the R-D cost of the most correlated MB. Through comparative analysis, a speed-up of up to 70.83% was verified with a negligible bit increment and a minimal loss of image quality.

  • A Technique for Evaluation of CCD Video-Camera Noise

    Page(s): 280 - 284

    This paper presents a technique to identify and measure the prominent sources of sensor noise in commercially available charge-coupled device (CCD) video cameras by analyzing their output images. Noise fundamentally limits the distinguishable content in an image and can significantly reduce the robustness of an image processing application. Although the sources of image sensor noise are well documented, there has been little work on techniques to identify and quantify the types of noise present in CCD video-camera images. A comprehensive noise model for CCD cameras was used to evaluate the technique on a commercially available CCD video camera.
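One standard building block for this kind of output-image noise measurement is separating temporal noise from fixed-pattern noise by capturing a static scene repeatedly: the per-pixel standard deviation over time isolates the temporal component. A minimal sketch with synthetic two-pixel "frames" (the paper's full procedure follows a comprehensive CCD noise model, not just this one statistic):

```python
import statistics

def temporal_noise_std(frames):
    """Estimate temporal noise from repeated captures of a static scene:
    the per-pixel standard deviation over time, averaged over pixels.
    Fixed-pattern noise is constant in time, so it cancels out here."""
    n = len(frames[0])
    per_pixel = [statistics.pstdev([f[p] for f in frames]) for p in range(n)]
    return sum(per_pixel) / n

# Three captures of a two-pixel "static scene" with synthetic noise.
frames = [[10.0, 20.0], [12.0, 22.0], [8.0, 18.0]]
sigma = temporal_noise_std(frames)
```

The constant offset between the two pixels (fixed pattern) does not affect the estimate; only the frame-to-frame fluctuation does.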

  • IEEE Circuits and Systems Society Information

    Page(s): C3
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Page(s): C4

Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multiprocessor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it