
IEEE Transactions on Circuits and Systems for Video Technology

Issue 7 • July 2009

  • Table of contents

    Publication Year: 2009, Page(s): C1
    PDF (85 KB) | Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Publication Year: 2009, Page(s): C2
    PDF (39 KB) | Freely Available from IEEE
  • Modeling and Analysis of Distortion Caused by Markov-Model Burst Packet Losses in Video Transmission

    Publication Year: 2009, Page(s): 917 - 931
    Cited by: Papers (10)
    PDF (1255 KB) | HTML

    This paper addresses the problem of distortion modeling for video transmission over burst-loss channels characterized by a finite-state Markov chain. Based on a detailed analysis of error propagation and bursty losses, a distortion trellis model is proposed, enabling us to estimate, at both the frame level and the sequence level, the expected mean-square error (MSE) distortion caused by Markov-model burst packet losses. The model takes into account the temporal dependencies induced by both the motion-compensated coding scheme and the Markov-model channel losses. It is applicable to most block-based motion-compensated encoders and most Markov-model lossy channels, as long as the loss pattern probabilities for the channel are computable. Based on a study of the decaying behavior of the error propagation, a sliding window algorithm is developed to perform the MSE estimation with low complexity. Simulation results show that the proposed models are accurate for all tested average loss rates and average burst lengths. Based on the experimental results, the proposed techniques are used to analyze the impact of factors such as average burst length on the average decoded video quality. The proposed model is further extended to a more general form, and the modeled distortion is compared with data produced from realistic network loss traces. The experimental results demonstrate that the proposed model is also accurate in estimating the expected distortion for video transmission in real networks.
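
    A minimal sketch of the kind of channel such models assume: a two-state Markov (Gilbert) packet-loss simulator. This is illustrative only, not the paper's distortion trellis; the function name and parameter values are invented.

```python
import numpy as np

def simulate_gilbert_losses(n_packets, p_gb, p_bg, seed=0):
    """Two-state Markov burst-loss channel: 'good' delivers, 'bad' drops.
    p_gb = P(good -> bad), p_bg = P(bad -> good).
    Stationary loss rate = p_gb / (p_gb + p_bg); mean burst length = 1 / p_bg."""
    rng = np.random.default_rng(seed)
    lost = np.zeros(n_packets, dtype=bool)
    bad = False
    for i in range(n_packets):
        lost[i] = bad
        bad = (rng.random() >= p_bg) if bad else (rng.random() < p_gb)
    return lost

losses = simulate_gilbert_losses(100_000, p_gb=0.02, p_bg=0.4)
print("loss rate:", losses.mean())  # ~ 0.02 / 0.42 = 0.048
```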

  • De-Interlacing Algorithm Using Spatial-Temporal Correlation-Assisted Motion Estimation

    Publication Year: 2009, Page(s): 932 - 944
    Cited by: Papers (6)
    PDF (2138 KB) | HTML

    De-interlacing algorithms convert interlaced video into progressive scanning format. The motion-adaptive technique provides acceptable picture quality, but the quality of motion areas still needs to be improved. Among the various de-interlacing techniques, the motion-compensated technique provides the best performance if the estimated motion information is reliable. However, it suffers from inaccurate motion estimation, and weak error protection then deteriorates the visual quality. This paper presents a motion-compensated de-interlacing algorithm with highly accurate motion estimation and robust error detection. In order to obtain more accurate motion information, spatial-temporal correlation-assisted motion estimation is proposed. The spatial and temporal correlations among the motion vectors (MVs) are exploited to find the true motion of the object. In order to reject incorrect temporal information, a hierarchical MV reliability verification is provided. Possible defects in both large and small areas can be detected effectively. The experimental results show that the proposed algorithm outperforms existing algorithms and produces high-quality de-interlaced results on various video sequences.
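
    For orientation, a baseline motion-adaptive de-interlacer of the class the abstract mentions (weave where static, line-average where moving); the paper's motion-compensated method is considerably more sophisticated. The function name, threshold, and field convention are assumptions.

```python
import numpy as np

def deinterlace_motion_adaptive(prev_frame, curr_field, next_frame,
                                top_field=True, thresh=10.0):
    """Baseline motion-adaptive de-interlacing (illustrative, not the paper's
    method). curr_field: full-size array whose even (top field) or odd rows
    hold the known lines. Static pixels are woven from temporal neighbors;
    moving pixels are spatially interpolated (line average)."""
    prev_f = prev_frame.astype(np.float64)
    next_f = next_frame.astype(np.float64)
    out = curr_field.astype(np.float64).copy()
    h = out.shape[0]
    for y in range(1 if top_field else 0, h, 2):   # rows to reconstruct
        above = out[y - 1] if y > 0 else out[y + 1]
        below = out[y + 1] if y + 1 < h else out[y - 1]
        spatial = 0.5 * (above + below)            # line average
        temporal = 0.5 * (prev_f[y] + next_f[y])   # weave
        motion = np.abs(prev_f[y] - next_f[y])     # crude motion detector
        out[y] = np.where(motion < thresh, temporal, spatial)
    return out
```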

  • Robust Video Stabilization Based on Particle Filter Tracking of Projected Camera Motion

    Publication Year: 2009, Page(s): 945 - 954
    Cited by: Papers (24)
    PDF (1486 KB) | HTML

    Video stabilization is an important technique in digital cameras, and its impact is increasing rapidly with the rising popularity of handheld cameras and cameras mounted on moving platforms (e.g., cars). Stabilization of two images can be viewed as an image registration problem. However, to ensure the visual quality of the whole video, video stabilization places particular emphasis on accuracy and robustness over long image sequences. In this paper, we propose a novel technique for video stabilization based on the particle filtering framework. We extend the traditional use of particle filters in object tracking to tracking of the projected affine model of the camera motion, and rely on the inverse of the resulting image transform to obtain a stable video sequence. Correspondences between scale-invariant feature transform points are used to obtain a crude estimate of the projected camera motion, which we then postprocess with particle filters to obtain a smooth estimate. It is shown both theoretically and experimentally that particle filtering can reduce the error variance compared to estimation without particle filtering. The superior performance of our algorithm over other video stabilization methods is demonstrated through computer-simulated experiments.
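
    A toy 1-D version of the final stage of such a pipeline, hedged: particles track a single translation parameter, a noisy crude estimate plays the role of the feature-based measurement, and the posterior mean gives the smoothed path. All data and parameters below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic crude per-frame global-motion estimates = smooth path + jitter.
T = 120
true_path = np.cumsum(rng.normal(0, 0.3, T))   # smooth intentional motion
crude = true_path + rng.normal(0, 2.0, T)      # noisy crude estimates

N = 500
particles = np.zeros(N)
weights = np.full(N, 1.0 / N)
sigma_proc, sigma_obs = 0.5, 2.0
smoothed = np.empty(T)

for t in range(T):
    particles += rng.normal(0, sigma_proc, N)  # random-walk motion prior
    weights *= np.exp(-0.5 * ((crude[t] - particles) / sigma_obs) ** 2)
    weights /= weights.sum()
    smoothed[t] = weights @ particles          # posterior-mean estimate
    # systematic resampling to avoid weight degeneracy
    positions = (rng.random() + np.arange(N)) / N
    idx = np.minimum(np.searchsorted(np.cumsum(weights), positions), N - 1)
    particles, weights = particles[idx], np.full(N, 1.0 / N)

# Per-frame stabilizing shift: remove the jitter, keep the smooth path.
correction = crude - smoothed
```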

  • Audio-Guided Video-Based Face Recognition

    Publication Year: 2009, Page(s): 955 - 964
    Cited by: Papers (6)
    PDF (690 KB) | HTML

    In this paper, we develop a new video-to-video face recognition algorithm. The major advantage of the video-based method is that more information is available in a video sequence than in a single image. In order to take advantage of the large amount of information in the video sequence while overcoming the processing speed and data size problems, we develop several new techniques, including temporal and spatial frame synchronization, multilevel discriminant subspace analysis, and multiclassifier integration for video sequence processing. An aligned video sequence for each person is first obtained by applying temporal and spatial synchronization, which effectively establishes the face correspondence using both audio and video information; multilevel discriminant subspace analysis or multiclassifier integration is then employed for further analysis based on the synchronized sequence. The method preserves most of the temporal-spatial information contained in a video sequence. Extensive experiments on the XM2VTS database clearly show the superiority of our new algorithms, with near-perfect classification results (99.3%) obtained.
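
    The paper's multilevel discriminant subspace analysis is beyond a short sketch, but a plain discriminant-subspace classifier on toy frame features conveys the flavor. The dataset, dimensions, and nearest-neighbor rule below are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-in for aligned face frames: one 200-D feature vector per frame,
# several frames per subject; real systems extract these from synchronized
# video. Everything here is synthetic.
rng = np.random.default_rng(0)
n_subjects, frames_per_subject, dim = 10, 20, 200
centers = rng.normal(0, 1, (n_subjects, dim))
X = np.repeat(centers, frames_per_subject, axis=0)
X += rng.normal(0, 0.8, X.shape)
y = np.repeat(np.arange(n_subjects), frames_per_subject)

# Project into a discriminant subspace (at most n_subjects - 1 dimensions),
# then label a probe frame with a nearest-neighbor rule.
lda = LinearDiscriminantAnalysis(n_components=n_subjects - 1).fit(X, y)
clf = KNeighborsClassifier(n_neighbors=1).fit(lda.transform(X), y)
probe = centers[3] + rng.normal(0, 0.8, dim)
print("predicted subject:", clf.predict(lda.transform(probe[None]))[0])
```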

  • Generalized Butterfly Graph and Its Application to Video Stream Authentication

    Publication Year: 2009, Page(s): 965 - 977
    Cited by: Papers (1)
    PDF (2207 KB) | HTML

    This paper presents the generalized butterfly graph (GBG) and its application to video stream authentication. Compared with the original butterfly graph, the proposed GBG provides significantly greater flexibility, which is necessary for streaming applications, including support for an arbitrary bit-rate budget for authentication and an arbitrary number of video packets. Within the GBG design, the problem of constructing an authentication graph is defined as follows: given the total number of packets to protect, the expected packet loss rate for the network, and the available overhead budget, how should one design the authentication graph to maximize the probability that the received packets are verifiable? Furthermore, since media packets are typically of unequal importance, we explore two variants of GBG authentication, packet sorting and unequal authentication protection, which treat packets differently based on their importance. Lastly, we examine how the proposed GBG authentication can be applied within the context of rate-distortion-authentication (R-D-A) optimized streaming: given a media stream protected by GBG authentication, the R-D-A optimized streaming technique computes an optimized transmission schedule by recognizing and accounting for the authentication dependencies in the GBG authentication graph.
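
    A hedged sketch of the quantity being optimized: a Monte Carlo estimate of verification probability over a generic hash-chain authentication DAG. The GBG topology itself is specified in the paper; the small graph and names below are made up.

```python
import random

def verification_prob(parents, n_trials=20_000, loss_rate=0.1, seed=0):
    """Per-packet P(verifiable | received) for a hash-chain authentication
    graph. parents[i] lists the packets that carry packet i's hash; packet 0
    is the signature packet. A received packet is verifiable if some chain of
    received packets reaches the signature. Parents must have smaller indices."""
    rng = random.Random(seed)
    n = len(parents)
    verifiable_counts, received_counts = [0] * n, [0] * n
    for _ in range(n_trials):
        received = [rng.random() >= loss_rate for _ in range(n)]
        verifiable = [False] * n
        verifiable[0] = received[0]
        for i in range(1, n):   # topological order: one forward pass suffices
            verifiable[i] = received[i] and any(verifiable[p] for p in parents[i])
        for i in range(n):
            if received[i]:
                received_counts[i] += 1
                verifiable_counts[i] += verifiable[i]
    return [v / max(r, 1) for v, r in zip(verifiable_counts, received_counts)]

# Tiny butterfly-like graph: each packet's hash is sent to two earlier packets.
parents = [[], [0], [0], [1, 2], [1, 2], [3, 4]]
print(verification_prob(parents))
```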

  • Generalized Embedding of Multiplicative Watermarks

    Publication Year: 2009, Page(s): 978 - 988
    Cited by: Papers (7)
    PDF (652 KB) | HTML

    This paper constructs a class of generalized embeddings of multiplicative watermarks, with ordinary multiplicative and additive methods included as special cases. The new watermarks automatically adapt to the local content of the host signal, benefiting perceptual quality. Decoding uses the optimal generalized correlation detector. The host interference is precanceled at the embedder side, and very high gains are obtained in decoding capability. We develop a performance analysis for this new class of embeddings, which shows that the plain multiplicative watermark is far outperformed by the new embedding and that the multiplicative watermark with host interference rejection is still suboptimal. The best embeddings and configurations are specified for typical scenarios. Our construction and performance analysis of the generalized embedding thus offer a class of new methods, and both are confirmed by empirical experiments.
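
    For reference, the plain multiplicative special case with a simple correlation detector, i.e., the baseline the generalized embedding is shown to outperform. The host model, strength gamma, and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=10.0, size=4096)   # stand-in for DCT magnitudes
w = rng.choice([-1.0, 1.0], size=x.size)          # bipolar watermark sequence
gamma = 0.1                                        # embedding strength

y = x * (1.0 + gamma * w)                          # plain multiplicative embedding

# Correlation detection: the right key yields ~ gamma * E[x] > 0, a wrong
# key yields ~ 0 (no host-interference precancellation here, unlike the paper).
w_wrong = rng.choice([-1.0, 1.0], size=x.size)
print("right key:", np.mean(y * w))
print("wrong key:", np.mean(y * w_wrong))
```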

  • Reversible Watermarking Algorithm Using Sorting and Prediction

    Publication Year: 2009, Page(s): 989 - 999
    Cited by: Papers (68)
    PDF (537 KB) | HTML

    This paper presents a reversible (lossless) watermarking algorithm for images that, in most cases, requires no location map. The algorithm embeds data into an image through its prediction errors. A sorting technique is used to order the prediction errors by the magnitude of their local variance. Using sorted prediction errors and, where needed (rarely), a reduced-size location map allows us to embed more data into the image with less distortion. The performance of the proposed reversible watermarking scheme is evaluated on different images and compared with four methods: those of Kamstra and Heijmans, Thodi and Rodriguez, and Lee et al. The results clearly indicate that the proposed scheme can embed more data with less distortion.
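
    A minimal prediction-error-expansion sketch on a 1-D signal, showing why such embedding is exactly reversible; it omits the paper's sorting by local variance, overflow handling, and location map, and all names are assumptions.

```python
import numpy as np

def embed(x, bits):
    """Prediction-error expansion: predict each sample from its (already
    embedded) left neighbor and expand the error to carry one bit."""
    y = x.astype(np.int64).copy()
    for i, b in enumerate(bits, start=1):
        e = int(x[i]) - int(y[i - 1])
        y[i] = y[i - 1] + 2 * e + b
    return y

def extract(y, n_bits):
    """Recover both the payload and the original samples exactly."""
    x = y.astype(np.int64).copy()
    bits = []
    for i in range(1, n_bits + 1):
        e2 = int(y[i]) - int(y[i - 1])
        bits.append(e2 & 1)
        x[i] = y[i - 1] + (e2 >> 1)   # floor shift inverts 2e + b
    return x, bits

x = np.array([50, 52, 51, 49, 53, 55], dtype=np.int64)
y = embed(x, [1, 0, 1, 1, 0])
xr, bits = extract(y, 5)
assert np.array_equal(x, xr) and bits == [1, 0, 1, 1, 0]
```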

  • Statistical Framework for Video Decoding Complexity Modeling and Prediction

    Publication Year: 2009, Page(s): 1000 - 1013
    Cited by: Papers (6)
    PDF (432 KB) | HTML

    Video decoding complexity modeling and prediction is increasingly important for efficient resource utilization in a variety of applications, including task scheduling, receiver-driven complexity shaping, and adaptive dynamic voltage scaling. In this paper, we present a novel view of this problem from a statistical framework perspective. We explore the statistical structure (clustering) of the execution time required by each video decoder module (entropy decoding, motion compensation, etc.) in conjunction with complexity features that are easily extractable at encoding time (representing the properties of each module's input source data). For this purpose, we employ Gaussian mixture models (GMMs) and an expectation-maximization algorithm to estimate the joint execution-time-feature probability density function (PDF) from a training set of typical video sequences in an offline estimation process. The obtained GMM representation is then used in conjunction with the complexity features of new video sequences to predict the execution time required to decode these sequences. Several prediction approaches are discussed and compared. The potential mismatch between the training set and new video content is addressed by adaptive online joint-PDF re-estimation. An experimental comparison evaluates the different approaches and compares the proposed prediction scheme with related resource prediction schemes from the literature. The usefulness of the proposed complexity-prediction approaches is demonstrated in an application of rate-distortion-complexity optimized decoding.
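
    A hedged sketch of the core estimator: fit a joint GMM over (feature, execution time) pairs and predict via the conditional mean. The single scalar feature, toy data, and component count are assumptions; the paper uses richer per-module features and online re-estimation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic training pairs: (complexity feature, decoding time). In the
# paper's setting the feature would be extracted at encoding time.
rng = np.random.default_rng(0)
feat = rng.uniform(0, 100, 1000)
time = 0.4 * feat + 5 + rng.normal(0, 2, 1000)
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(np.column_stack([feat, time]))

def predict_decode_time(f):
    """Conditional mean E[time | feature = f] under the fitted joint GMM."""
    w, mu, cov = gmm.weights_, gmm.means_, gmm.covariances_
    var_f = cov[:, 0, 0]
    # component responsibilities given f (Gaussian marginals; constants cancel)
    resp = w * np.exp(-0.5 * (f - mu[:, 0]) ** 2 / var_f) / np.sqrt(var_f)
    resp /= resp.sum()
    # per-component conditional means of time given f
    cond = mu[:, 1] + cov[:, 1, 0] / var_f * (f - mu[:, 0])
    return float(resp @ cond)

print(predict_decode_time(50.0))   # ~ 0.4 * 50 + 5 = 25
```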

  • Real-Time Illegal Parking Detection in Outdoor Environments Using 1-D Transformation

    Publication Year: 2009, Page(s): 1014 - 1024
    Cited by: Papers (9)
    PDF (2511 KB) | HTML

    With decreasing costs of high-quality surveillance systems, human activity detection and tracking have become increasingly practical. Accordingly, automated systems have been designed for numerous detection tasks, but the task of detecting illegally parked vehicles has been left largely to human operators of surveillance systems. We propose a methodology for detecting this event in real time by applying a novel image projection that reduces the dimensionality of the data and, thus, the computational complexity of the segmentation and tracking processes. After event detection, we invert the transformation to recover the original appearance of the vehicle and to allow further processing that may require 2-D data. We evaluate the performance of our algorithm using the i-LIDS vehicle detection challenge datasets as well as videos we have taken ourselves. These videos test the algorithm in a variety of outdoor conditions, including nighttime video and sudden changes in weather.
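
    An illustrative reduction of the idea: project each frame onto 1-D, compare against a background projection, and flag positions that deviate persistently. The vertical projection, thresholds, and names are assumptions; the paper's transformation is more tailored to the monitored region.

```python
import numpy as np

def project(frame):
    """Collapse a grayscale frame to 1-D by summing each column
    (illustrative choice of projection)."""
    return frame.astype(np.float64).sum(axis=0)

def parked_positions(frames, bg_projection, thresh=500.0, min_frames=50):
    """Positions whose projection deviates from the background projection for
    at least min_frames consecutive frames: candidate stationary objects."""
    persist = np.zeros_like(bg_projection)
    for frame in frames:
        deviating = np.abs(project(frame) - bg_projection) > thresh
        persist = np.where(deviating, persist + 1, 0)   # reset where scene matches
    return np.flatnonzero(persist >= min_frames)
```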

  • Robust Region-of-Interest Determination Based on User Attention Model Through Visual Rhythm Analysis

    Publication Year: 2009, Page(s): 1025 - 1038
    Cited by: Papers (6)
    PDF (1949 KB) | HTML

    Region-of-interest (ROI) determination is very important for video processing, and a simple method to identify the ROI is desirable. Along this direction, this paper investigates a user attention model based on visual rhythm analysis for automatic determination of the ROI in a video. The visual rhythm is a thumbnail abstraction of a video: a 2-D image that captures the temporal information of a video sequence. Four sampling lines (diagonal, anti-diagonal, vertical, and horizontal) are employed to obtain four visual rhythm maps, from which the location of the ROI is analyzed. Via the variation in the visual rhythms, object and camera motions can be efficiently distinguished. With hardware design in mind, the proposed scheme can accurately extract the ROI with very low computational complexity, suiting real-time applications. The promising experimental results demonstrate that moving objects are extracted effectively and efficiently. Finally, we present a way to use flexible macroblock ordering in combination with ROI determination as a preprocessing step for H.264/AVC video coding; experimental results show that the quality of ROI regions is significantly enhanced.
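
    A small sketch of visual-rhythm construction: sample one line per frame and stack the samples into a 2-D image, mirroring the four sampling lines named above. The frame format and function name are assumptions.

```python
import numpy as np

def visual_rhythm(frames, line="diagonal"):
    """Stack one sampled line per frame into a 2-D visual-rhythm image.
    frames: iterable of equal-size grayscale frames (H x W numpy arrays).
    Rows index position along the sampling line; columns index time."""
    cols = []
    for f in frames:
        h, w = f.shape
        t = np.linspace(0, 1, min(h, w))
        if line == "diagonal":
            cols.append(f[(t * (h - 1)).astype(int), (t * (w - 1)).astype(int)])
        elif line == "anti-diagonal":
            cols.append(f[(t * (h - 1)).astype(int), ((1 - t) * (w - 1)).astype(int)])
        elif line == "vertical":
            cols.append(f[:, w // 2])
        else:  # horizontal
            cols.append(f[h // 2, :])
    return np.stack(cols, axis=1)
```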

  • Efficient Block Matching Motion Estimation Using Multilevel Intra- and Inter-Subblock Features: Subblock-Based SATD

    Publication Year: 2009, Page(s): 1039 - 1043
    Cited by: Papers (1)
    PDF (165 KB) | HTML

    Block-matching motion estimation conventionally employs the pixel-based sum of absolute differences (SAD) metric as the distortion measure. Many fast matching mechanisms have been developed to lower the computational load of block matching, such as those exploiting block-feature-based or attribute-based SAD calculations. Starting from a subblock-sum-based SAD measure, where the subblock sums can be considered intra-subblock features, we propose a SAD measure based on both intra- and inter-subblock features for more effective block matching in terms of coding efficiency. Interestingly, the proposed feature-based SAD measure can be interpreted as a subblock-based sum of absolute Hadamard-transformed differences (SATD). The new features can be constructed in a multilevel structure, which provides a flexible and scalable means of trading off computational load against coding efficiency. Encoding results compare the proposed scheme against other relevant feature-based SAD measures.
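
    For reference, the standard 4x4 Hadamard SATD applied per subblock, the quantity the proposed feature-based measure is shown to coincide with. The /2 normalization follows common encoder practice and is an assumption here, as are the function names.

```python
import numpy as np

# 4x4 Hadamard matrix (sequency order; row order does not affect the SATD sum).
H4 = np.array([[1,  1,  1,  1],
               [1,  1, -1, -1],
               [1, -1, -1,  1],
               [1, -1,  1, -1]])

def satd4x4(a, b):
    """Sum of absolute 4x4 Hadamard-transformed differences."""
    d = a.astype(np.int64) - b.astype(np.int64)
    return np.abs(H4 @ d @ H4.T).sum() // 2   # /2 as in common implementations

def satd_16x16(a, b):
    """4x4 SATD accumulated over the sixteen subblocks of a 16x16 block."""
    return sum(satd4x4(a[y:y + 4, x:x + 4], b[y:y + 4, x:x + 4])
               for y in range(0, 16, 4) for x in range(0, 16, 4))
```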

  • Selective Search Area Reuse Algorithm for Low External Memory Access Motion Estimation

    Publication Year: 2009, Page(s): 1044 - 1050
    Cited by: Papers (8)
    PDF (870 KB) | HTML

    In motion estimation for video codecs, reducing the amount of external memory access is critical for lowering power consumption and minimizing performance degradation. Previous search area reuse algorithms for reducing memory access still suffer from coding efficiency degradation on fast-motion video. Previously, we proposed a selective search area reuse (SSAR) algorithm to reduce external memory access with minimal coding efficiency degradation. In this letter, we extend the SSAR algorithm to multiple-reference-frame motion estimation with a method to utilize multiple on-chip memories. We then propose a frame-level dynamic search range algorithm based on SSAR and, finally, a memory usage switching method to increase the utilization of the limited-size on-chip memory. Experimental results show that the proposed algorithm with a search range of 16 achieves a 28.64-56.24% reduction in memory access, depending on the number of on-chip memories, with multiple reference frames. On the Foreman video sequence, our algorithm operating with a fixed-size on-chip memory compensated for quality degradation by up to 2.7 dB in frames with fast camera motion, and reduced the amount of memory access by 22.6% with a peak signal-to-noise ratio gain of 1 dB in frames with camera shaking.
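
    A back-of-the-envelope illustration of why search-area reuse cuts external memory traffic; this is the basic sliding-window baseline such algorithms build on, not SSAR itself, and the sizes are assumed for illustration.

```python
# External-memory traffic per macroblock: full search-window reload vs.
# reuse between horizontally adjacent macroblocks (illustrative numbers).
MB = 16                    # macroblock size in pixels
SR = 16                    # search range of +/- SR pixels
win = MB + 2 * SR          # 48x48 search window per macroblock

full_reload = win * win    # pixels fetched per MB with no reuse
new_columns = MB * win     # only MB new columns when the window slides by MB
print(f"no reuse: {full_reload} px/MB, sliding reuse: {new_columns} px/MB "
      f"({100 * (1 - new_columns / full_reload):.1f}% less)")
```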

  • High-Fidelity RGB Video Coding Using Adaptive Inter-Plane Weighted Prediction

    Publication Year: 2009, Page(s): 1051 - 1056
    Cited by: Papers (16) | Patents (1)
    PDF (200 KB) | HTML

    This letter presents an efficient video coding algorithm for the RGB color space. Currently, most video coding algorithms operate in a decorrelated color space such as YUV or YCbCr, which has reduced inter-color redundancy. For high-fidelity video applications, however, it is essential to maintain the original signal fidelity without any mismatch from color space conversion. In this context, the H.264/AVC High 4:4:4 Intra/Predictive profiles support the RGB color space to satisfy the increasing demand for high-fidelity video coding. However, little effort has been made to exploit inter-color correlation to increase RGB coding efficiency. In pursuit of both high fidelity and coding efficiency, we propose an adaptive inter-plane-weighted prediction algorithm that exploits the inter-color redundancy of the RGB signal. We integrate the proposed algorithm into the H.264/AVC High 4:4:4 Intra/Predictive profile reference software for simulation and show that it increases RGB video coding efficiency by an average of 0.80 dB and 0.75 dB compared with existing H.264/AVC High 4:4:4 Intra profile RGB coding and YCbCr 4:4:4 coding, respectively.
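
    A hedged sketch of the underlying idea: predict one plane from the G plane with a least-squares weight and offset, and code only the residual. The global (rather than adaptive per-block) fit and all names are simplifying assumptions.

```python
import numpy as np

def interplane_residual(g_plane, x_plane):
    """Predict a color plane from the G plane via a least-squares weight w and
    offset b; the residual is what remains to be coded (illustrative only;
    per-block adaptation and bitstream details are simplified away)."""
    g = g_plane.astype(np.float64).ravel()
    x = x_plane.astype(np.float64).ravel()
    A = np.column_stack([g, np.ones_like(g)])
    (w, b), *_ = np.linalg.lstsq(A, x, rcond=None)
    pred = w * g_plane + b
    return x_plane - pred, (w, b)

rng = np.random.default_rng(0)
G = rng.integers(0, 256, (64, 64)).astype(np.float64)
B = 0.9 * G + 10 + rng.normal(0, 2, G.shape)      # strongly correlated toy B plane
res, (w, b) = interplane_residual(G, B)
print(f"w={w:.2f}, b={b:.1f}, residual var {res.var():.1f} vs plane var {B.var():.1f}")
```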

  • Modified Steepest-Descent for Bit Allocation in Strongly Dependent Video Coding

    Publication Year: 2009, Page(s): 1057 - 1062
    Cited by: Papers (4)
    PDF (549 KB) | HTML

    This letter addresses efficient bit allocation for dependent video coding in the context of ascribing a quantizer to each macroblock (MB) within a frame. Strongly dependent MBs exhibit significant discontinuities on their operational distortion-rate (D-R) convex hulls, which nullify the assumptions underlying the coordinate-wise steepest-descent (SD) algorithm. A qualified-slope concept and a near-neighborhood look-ahead modification are introduced to circumvent these problems and are experimentally evaluated on a set of H.263+ encodings.
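
    For context, a minimal marginal-analysis (steepest-slope) allocator over toy per-MB D-R tables, i.e., the classical SD baseline whose assumptions the letter's qualified-slope and look-ahead modifications repair. The data and names are invented; convex, independent curves are assumed.

```python
import heapq

def allocate(dr_curves, budget):
    """Greedy steepest-descent allocation: repeatedly refine the macroblock
    whose next quantizer step gives the largest distortion drop per extra bit.
    dr_curves[i]: list of (rate, distortion) points with increasing rate and
    decreasing distortion."""
    choice = [0] * len(dr_curves)                   # coarsest point per MB
    spent = sum(c[0][0] for c in dr_curves)
    heap = []
    for i, c in enumerate(dr_curves):
        if len(c) > 1:
            dd, db = c[0][1] - c[1][1], c[1][0] - c[0][0]
            heapq.heappush(heap, (-dd / db, i))     # steepest slope first
    while heap:
        _, i = heapq.heappop(heap)
        c, j = dr_curves[i], choice[i]
        extra = c[j + 1][0] - c[j][0]
        if spent + extra > budget:
            continue                                # cannot afford this step
        spent += extra
        choice[i] = j + 1
        if j + 2 < len(c):
            dd, db = c[j + 1][1] - c[j + 2][1], c[j + 2][0] - c[j + 1][0]
            heapq.heappush(heap, (-dd / db, i))
    return choice, spent

curves = [[(2, 100), (5, 60), (9, 45)], [(2, 80), (4, 30), (7, 20)]]
print(allocate(curves, budget=12))   # -> ([1, 2], 12)
```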

  • Detecting Cross-Fades in Interlaced Video With 3:2 Film Cadence

    Publication Year: 2009, Page(s): 1063 - 1067
    PDF (217 KB) | HTML

    This letter presents an algorithm for detecting cross-fade scene changes in video carrying 3:2 or mixed film cadence. Many methods exist for detecting gradual video transitions, but the current literature does not address the complication of film cadence. The differences between video and film capture, and the mechanics of the telecine transfer process used to convert 24-Hz film to the main international television standards, alter the temporal properties in a nonlinear way that makes cross-fades more difficult to detect with existing methods. An algorithm is proposed to address this problem.
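
    As a baseline only: the common variance-dip heuristic for detecting dissolves on progressive video. With 3:2 telecined material the repeated fields break this clean pattern, which is precisely the complication the letter addresses; the thresholds and names below are assumptions.

```python
import numpy as np

def dissolve_candidates(frames, gap=8, ratio=0.8):
    """During a linear dissolve f(t) = (1 - a(t)) A + a(t) B between weakly
    correlated shots, per-frame variance dips below its neighborhood, so flag
    frames whose variance falls well below both nearby past and future values.
    Progressive (non-telecined) input is assumed."""
    v = np.array([np.var(f.astype(np.float64)) for f in frames])
    past = np.roll(v, gap);  past[:gap] = v[0]
    future = np.roll(v, -gap); future[-gap:] = v[-1]
    return np.flatnonzero(v < ratio * np.minimum(past, future))
```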

  • On the Optimality of Motion-Based Particle Filtering

    Publication Year: 2009, Page(s): 1068 - 1072
    Cited by: Papers (11)
    PDF (527 KB) | HTML

    Particle filters have revolutionized object tracking in video sequences. The conventional particle filter, also called the CONDENSATION filter, uses the state transition distribution as the proposal distribution from which particles are drawn at each iteration. However, the transition distribution does not take the current observations into account, so many particles can be wasted in low-likelihood regions. One of the most popular ways to improve the performance of particle filters relies on a motion-based proposal density. Although the motivation for motion-based particle filters can be explained on an intuitive level, a mathematical rationale for their improved performance has not previously been presented. In this letter, we investigate the performance of motion-based particle filters and provide an analytical justification of their superiority over the classical CONDENSATION filter. We rely on the characterization of the optimal proposal density, which minimizes the variance of the particles' weights. This density does not admit an analytical expression, however, making direct sampling from it impossible. We use the Kullback-Leibler (KL) divergence as a similarity measure between density functions and call one particle filter superior to another if the KL divergence between its proposal and the optimal proposal is lower. We then prove that, under mild conditions on the estimated motion vector, the motion-based particle filter outperforms the CONDENSATION filter in terms of this KL performance measure. Simulation results support the theoretical analysis.
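
    The closed-form KL divergence for 1-D Gaussians makes the performance measure concrete: a proposal concentrated near the optimal proposal scores lower. Treating the (intractable) optimal proposal as approximately Gaussian and all numbers below are illustrative assumptions, not values from the letter.

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    """KL( N(mu0, var0) || N(mu1, var1) ) in closed form."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

opt_mu, opt_var = 10.0, 1.0   # stand-in for the optimal proposal density
# A blind transition prior vs. a motion-based proposal centered near the
# observation: the latter has a far smaller KL to the optimal proposal.
print("transition prior:", kl_gauss(0.0, 4.0, opt_mu, opt_var))  # large
print("motion-based    :", kl_gauss(9.0, 2.0, opt_mu, opt_var))  # small
```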

  • Cross-Based Local Stereo Matching Using Orthogonal Integral Images

    Publication Year: 2009, Page(s): 1073 - 1079
    Cited by: Papers (68)
    PDF (590 KB) | HTML

    We propose an area-based local stereo matching algorithm for accurate disparity estimation across all image regions. A well-known challenge for local stereo methods is deciding an appropriate support window for the pixel under consideration, adapting the window shape or the pixelwise support weight to the underlying scene structures. Our stereo method tackles this problem with two key contributions. First, for each anchor pixel an upright cross local support skeleton is adaptively constructed, with four varying arm lengths decided by color similarity and connectivity constraints. Second, given the local cross-decision results, we dynamically construct a shape-adaptive full support region on the fly by merging horizontal segments of the crosses in the vertical neighborhood. Approximating image structures accurately, the proposed method is among the best-performing local stereo methods on the benchmark Middlebury stereo evaluation. Additionally, it reduces memory consumption significantly thanks to the compact local cross representation. To accelerate matching cost aggregation over an arbitrarily shaped 2-D region, we also propose an orthogonal integral image technique, yielding a speedup factor of 5-15 over straightforward integration.
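
    A sketch of the integral-image ingredient: one cumulative-sum pass per row makes any horizontal segment sum O(1), so aggregating a cost over a cross-shaped support reduces to a few lookups per pixel (plus a vertical pass in the full method). The function names are assumptions.

```python
import numpy as np

def horizontal_integral(cost):
    """Row-wise prefix sums: s[y, x] = sum of cost[y, :x]."""
    s = np.zeros((cost.shape[0], cost.shape[1] + 1), dtype=np.float64)
    s[:, 1:] = np.cumsum(cost, axis=1)
    return s

def segment_sum(s, y, x_left, x_right):
    """Sum of cost[y, x_left..x_right] in O(1) from the prefix sums."""
    return s[y, x_right + 1] - s[y, x_left]

cost = np.arange(12.0).reshape(3, 4)   # toy per-pixel matching cost
s = horizontal_integral(cost)
assert segment_sum(s, 1, 1, 3) == cost[1, 1:4].sum()
```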

  • 2010 IEEE International Symposium on Circuits and Systems (ISCAS2010)

    Publication Year: 2009, Page(s): 1080
    PDF (701 KB) | Freely Available from IEEE
  • IEEE Circuits and Systems Society Information

    Publication Year: 2009, Page(s): C3
    PDF (33 KB) | Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Publication Year: 2009, Page(s): C4
    PDF (33 KB) | Freely Available from IEEE

Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology 

 


Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it