IEEE Transactions on Circuits and Systems for Video Technology

Issue 1 • January 2012

  • Table of contents

    Page(s): C1
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): C2
  • Flexible Adaptive Multiple Description Coding for Video Transmission

    Page(s): 1 - 11

    A channel-adaptive multiple description video codec is presented with flexible redundancy allocation based on modeling and minimization of the end-to-end distortion. We employ a three-loop multiple description coding scheme for which we develop models that estimate the rate-distortion performance of the side encoders as well as the overall end-to-end distortion given channel statistics. A simple yet effective algorithm is formulated for determining appropriate levels of redundancy given a total bit rate and channel estimates in the form of packet error rates. The experimental results presented validate the proposed models over various channel conditions. The performance and adaptivity of the codec are evaluated through extensive simulations with a 2×2 wireless multiple-input multiple-output system. A gain of more than 10 dB can be achieved compared to a non-adaptive system, and even larger gains are possible relative to typical single description transmissions.
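
    The redundancy allocation can be illustrated as a small search over candidate redundancy levels. This is a minimal sketch under assumed one-parameter distortion models; `d_central` and `d_side` are stand-ins for the fitted models in the paper, not taken from it:

    ```python
    import numpy as np

    # Hypothetical rate-distortion models; stand-ins for the paper's fitted
    # side/central distortion models.
    def d_central(rate):            # both descriptions received
        return 1.0 / (1.0 + rate)

    def d_side(rate):               # only one description received
        return 4.0 / (1.0 + rate)

    def best_redundancy(total_rate, p_loss, candidates=np.linspace(0.05, 0.5, 10)):
        """Pick the redundancy fraction minimizing expected end-to-end
        distortion for independent packet losses with probability p_loss."""
        def expected_distortion(r):
            d_c = d_central(total_rate * (1 - r))    # redundancy costs central quality
            d_s = d_side(total_rate * (1 + r) / 2)   # ...but improves side quality
            return ((1 - p_loss) ** 2 * d_c
                    + 2 * p_loss * (1 - p_loss) * d_s
                    + p_loss ** 2 * 1.0)             # both lost: maximal distortion
        return min(candidates, key=expected_distortion)
    ```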

  • Packet Video Error Concealment With Auto Regressive Model

    Page(s): 12 - 27

    In this paper, an autoregressive (AR) model is applied to error concealment for block-based packet video coding. In the proposed error concealment scheme, the motion vector for each corrupted block is first derived by any recovery algorithm. Each pixel within the corrupted block is then replenished, in a regression manner, as the weighted summation of pixels within a square centered at the pixel indicated by the derived motion vector. Two block-dependent AR coefficient derivation algorithms are proposed, under spatial and temporal continuity constraints, respectively. The first derives the AR coefficients by minimizing the summation of the weighted square errors within all the available neighboring blocks under the spatial continuity constraint; the confidence weight of each pixel sample within the available neighboring blocks is inversely proportional to its distance from the corrupted block. The second derives the AR coefficients by minimizing the summation of the weighted square errors within an extended block in the previous frame along the motion trajectory under the temporal continuity constraint; the confidence weight of each extended sample is inversely proportional to its distance from the corresponding motion-aligned block, whereas the confidence weight of each sample within the motion-aligned block is set to one. The regression results generated by the two algorithms are then merged to form the final restoration. Various experimental results demonstrate that the proposed error concealment strategy improves both the objective and subjective quality of the replenished blocks compared to other methods.
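
    The core of both derivation algorithms is a weighted least-squares fit of the AR coefficients. A minimal sketch of that step, where the training-set construction and the confidence weights are assumed inputs:

    ```python
    import numpy as np

    def ar_coefficients(neighborhoods, targets, weights):
        """Weighted least-squares AR fit: each row of `neighborhoods` holds
        the reference samples around one training pixel, `targets` the pixel
        values, `weights` the per-sample confidence (e.g., inverse distance)."""
        W = np.diag(weights)
        A = neighborhoods.T @ W @ neighborhoods
        b = neighborhoods.T @ W @ targets
        return np.linalg.solve(A, b)

    def replenish_pixel(reference_window, coeffs):
        """Restore one corrupted pixel as the weighted sum of the reference
        samples pointed to by the recovered motion vector."""
        return float(reference_window.ravel() @ coeffs)
    ```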

  • Fast Motion Estimation System Using Dynamic Models for H.264/AVC Video Coding

    Page(s): 28 - 42

    H.264/AVC offers many coding tools for achieving compression gains of up to 50% over other standards. These tools dramatically increase the computational complexity of block-based motion estimation (BB-ME), which consumes up to 80% of the entire encoder's computations. In this paper, computationally efficient and accurate skipping models are proposed to speed up any BB-ME algorithm. First, an accurate initial search center (ISC) is decided using a smart prediction technique. Thereafter, a dynamic early stop search termination (DESST) test decides whether the block at the ISC position can be considered a best-match candidate. If the DESST test fails, a less complex motion estimation algorithm incorporating a dynamic padding window size technique is used. Further reductions in computation are achieved by combining two techniques: a dynamic partial internal stop search technique, which exploits an accurate adaptive threshold model to skip internal sum-of-absolute-differences operations between the current and candidate blocks, and a dynamic external stop search technique, which skips all the irrelevant blocks in the search area. The proposed techniques can be incorporated into any block-matching motion estimation algorithm, and the reduction in computational complexity is reflected in the motion estimation encoding time. The novelty of the proposed techniques lies in their superior computational savings with acceptable degradation in both peak signal-to-noise ratio (PSNR) and bit rate compared to state-of-the-art and recent motion estimation techniques. Simulation results using the H.264/AVC reference software (JM 12.4) show up to 98% savings in motion estimation time compared to the conventional full search algorithm, with a negligible PSNR degradation of approximately 0.05 dB and an increase in the required bits per frame of only 2%. Experimental results also prove the effectiveness of the proposed techniques when incorporated with fast BB-ME techniques such as the fast extended diamond enhanced predictive zonal search and the predictive motion vector field adaptive search technique.
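
    The internal-stop idea is compact enough to sketch: accumulate the SAD row by row and abandon the candidate as soon as the partial sum crosses the threshold. A minimal sketch; the adaptive threshold model itself is not reproduced here:

    ```python
    import numpy as np

    def sad_with_early_exit(cur_block, cand_block, threshold):
        """Row-wise SAD with an internal stop: abort once the partial sum
        exceeds the best cost seen so far (the adaptive threshold)."""
        total = 0
        for row in range(cur_block.shape[0]):
            diff = cur_block[row].astype(int) - cand_block[row].astype(int)
            total += np.abs(diff).sum()
            if total > threshold:      # dynamic partial internal stop
                return None            # candidate skipped early
        return total
    ```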

  • A Single-Pass-Based Localized Adaptive Interpolation Filter for Video Coding

    Page(s): 43 - 55

    Recently, an efficient coding tool named adaptive interpolation filtering (AIF) has been proposed for hybrid video coding schemes. By introducing a Wiener filter into the fractional-pixel interpolation procedure, AIF can reduce the inter-prediction error and improve coding efficiency significantly. However, the training-based Wiener filter mechanism gives AIF an inherent multi-pass encoding structure, which imposes a heavy burden on the encoder in terms of computational complexity and memory access. In this paper, we propose a single-pass-based localized adaptive interpolation filtering (SPL-AIF) algorithm for video coding, which reduces the complexity of AIF dramatically without sacrificing its coding performance. The proposed SPL-AIF algorithm is based on the observation that the optimal interpolation filters of consecutive frames are highly correlated, and that different regions in a frame often possess different statistical characteristics. Accordingly, the proposed algorithm consists of two major parts. First, a competitive filter set, which includes the optimal interpolation filters of several previous frames as well as the fixed H.264/AVC interpolation filters, is built for coding the current frame. Then a rate-distortion optimization criterion is used to select the best filter at the macroblock (MB) level. To reduce overhead, a predictive coding method is used to compress the filter signaling flag for each MB. Experimental results show that the proposed algorithm reduces encoding complexity significantly while improving the average coding gain, measured as Bjøntegaard distortion bit-rate reduction, by about 1% compared with multi-pass AIF. The proposed method has been adopted into the Video Coding Experts Group Key Technology Area software.
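
    The per-macroblock decision is a standard Lagrangian choice. A minimal sketch, where `distortion` and `flag_bits` are assumed callables rather than anything specified in the paper:

    ```python
    def select_filter(mb, filter_set, lam, distortion, flag_bits):
        """Return the filter in the competitive set minimizing the Lagrangian
        cost J = D + lambda * R for this macroblock; R here is only the bits
        spent signaling the filter choice."""
        return min(filter_set,
                   key=lambda f: distortion(mb, f) + lam * flag_bits(f))
    ```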

  • Cast Shadow Removal in a Hierarchical Manner Using MRF

    Page(s): 56 - 66

    In this paper, we present a novel method for shadow removal using Markov random fields (MRF). In our method, we first construct the shadow model in a hierarchical manner. At the pixel level, we use a Gaussian mixture model to model the behavior of cast shadows for every pixel in the HSV color space. The samples used to update the shadow model must satisfy a pre-classifier, which captures the color features of shadow in the current frame. At the global level, we exploit the statistical features of shadow in the whole scene over several consecutive frames to make this pre-classifier accurate and adaptive to changes in the shadow. Then, based on the shadow model, an MRF model is constructed for shadow removal. The main contribution of this paper is twofold. First, although our method is chroma-based, we make the pre-classifier accurate and adaptive to changes in the shadow by using the statistical features of shadow at the global level; moreover, tracking information makes this global-level statistical information more robust. Second, we construct an MRF model to represent the dependencies between the label of a pixel and the shadow models of its neighbors. Experimental results show that the proposed method is efficient and robust.
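
    As an illustration of the pixel-level model, here is a minimal online GMM update in the Stauffer-Grimson style; the learning rate, matching gate, and initialization are assumptions, not values from the paper:

    ```python
    import numpy as np

    def update_pixel_gmm(means, variances, weights, hsv, lr=0.05, gate=2.5):
        """One online update of a pixel's shadow GMM with an HSV sample that
        passed the pre-classifier. Shapes: means/variances (K, 3), weights (K,)."""
        d2 = ((hsv - means) ** 2 / variances).sum(axis=1)  # per-component distance
        k = int(np.argmin(d2))
        weights *= (1.0 - lr)                              # decay all weights
        if d2[k] < gate ** 2:                              # sample matches component k
            weights[k] += lr
            means[k] += lr * (hsv - means[k])
            variances[k] += lr * ((hsv - means[k]) ** 2 - variances[k])
        else:                                              # replace weakest component
            j = int(np.argmin(weights))
            means[j], variances[j], weights[j] = hsv.copy(), np.full(3, 0.05), lr
        weights /= weights.sum()
    ```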

  • VSYNC: Bandwidth-Efficient and Distortion-Tolerant Video File Synchronization

    Page(s): 67 - 76

    We introduce video-sync (VSYNC), a video file synchronization system that efficiently uses a bidirectional communications link to maintain up-to-date video sources at remote ends to a desired resolution and distortion level. By automatically detecting and transmitting only the differences between video files, VSYNC avoids unnecessary re-transmission of the entire video when there are only minor differences between video copies. A hierarchical hashing scheme is designed to allow synchronization to within some user-defined distortion, while being rate-efficient and computationally tractable. Distributed video coding is used to realize further rate savings when transmitting video updates. VSYNC is bandwidth-efficient and is useful in many scenarios, including video backup, video sharing, and video authentication applications. Experimental results show that rate savings ranging from 2× to 10× can be obtained by VSYNC with about 10% of the frames edited, compared to re-transmitting the compressed video or using a file synchronization utility such as rsync.
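
    The shape of a hierarchical hash comparison is easy to sketch. Note the simplifications: this uses exact SHA-1 digests, whereas the paper's scheme is distortion-tolerant, and it assumes both copies have the same frame count and grouping:

    ```python
    import hashlib

    def hash_tree(frames, group=8):
        """Two-level digest tree over encoded frames (bytes): one digest per
        group plus per-frame digests, so matching groups are skipped wholesale."""
        chunks = [frames[i:i + group] for i in range(0, len(frames), group)]
        return [(hashlib.sha1(b"".join(c)).digest(),
                 [hashlib.sha1(f).digest() for f in c]) for c in chunks]

    def changed_frames(tree_a, tree_b, group=8):
        """Indices of frames whose digests differ, descending only into groups
        whose top-level digests mismatch."""
        out = []
        for gi, ((ga, fa), (gb, fb)) in enumerate(zip(tree_a, tree_b)):
            if ga != gb:
                out += [gi * group + i
                        for i, (a, b) in enumerate(zip(fa, fb)) if a != b]
        return out
    ```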

  • Modeling and Compressing 3-D Facial Expressions Using Geometry Videos

    Page(s): 77 - 90

    In this paper, we present a novel geometry video (GV) framework to model and compress 3-D facial expressions. GV bridges the gap between 3-D motion data and 2-D video, and provides a natural way to apply well-studied video processing techniques to motion data processing. Our framework includes a set of algorithms to construct GVs, such as hole filling, geodesic-based face segmentation, expression-invariant parameterization (EIP), and GV compression. Our EIP algorithm guarantees the exact correspondence of the salient features (eyes, mouth, and nose) in different frames, which leads to GVs with better spatial and temporal coherence than those of conventional parameterization methods. Taking advantage of this feature, we also propose a new H.264/AVC-based progressive directional prediction scheme, which provides a further 10%-16% bit-rate reduction compared to the original H.264/AVC applied to GV compression, while maintaining good video quality. Our experimental results on real-world datasets demonstrate that GV is very effective for modeling high-resolution 3-D expression data, thus providing an attractive approach to expression information processing for the gaming and movie industries.

  • Particle Filtering Based Estimation of Consistent Motion and Disparity With Reduced Search Points

    Page(s): 91 - 104

    A particle-filtering-based block-wise method is proposed for estimating motion in a video sequence and for jointly estimating disparity and motion in a stereo video sequence. The motion and disparity parameters of a block are defined as a state, and the evolution of the state with respect to the block index is tracked with particle filtering. The state is assumed to depend on the states of neighboring blocks. The estimated motion and disparity fields are consistent and suitable for intermediate frame or view generation. The particle filter provides a method to effectively sample the search space: the particles concentrate in regions where the probability density function of the state has large values, so their locations are good candidates for the search points of a fast search method. The proposed method can estimate motion and disparity with a fraction of the search points required by conventional estimation methods.
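
    A minimal sketch of the sampling idea; the Gaussian proposal, particle count, and the `matching_cost` callable are illustrative assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def particle_motion_search(matching_cost, predicted_mv, n=50, spread=4.0):
        """Sample candidate motion vectors around the predicted state, weight
        them by their matching cost, and return the posterior-mean estimate.
        The sampled particles double as the search points of a fast method."""
        particles = predicted_mv + rng.normal(0.0, spread, size=(n, 2))
        costs = np.array([matching_cost(mv) for mv in particles])
        w = np.exp(-(costs - costs.min()))      # lower cost -> higher weight
        w /= w.sum()
        return w @ particles                    # weighted-mean motion vector
    ```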

  • High-Frame-Rate Optical Flow System

    Page(s): 105 - 112

    In this paper, we develop a high-frame-rate (HFR) vision system that can estimate the optical flow in real time at 1000 f/s for 1024×1024 pixel images via the hardware implementation of an improved optical flow detection algorithm on a high-speed vision platform. Based on the Lucas-Kanade method, we adopt an improved gradient-based algorithm that adaptively selects a pseudo-variable frame rate according to the amplitude of the estimated optical flow, so that the optical flow is accurately detected for objects moving at high and low speeds in the same image. The performance of the developed HFR optical flow system was verified through experiments on high-speed movements such as a top's spinning motion and a human's pitching motion.
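
    For reference, the Lucas-Kanade building block the system is based on amounts to solving a small least-squares system per pixel. A minimal single-point sketch; the window size and gradient operators are assumptions:

    ```python
    import numpy as np

    def lucas_kanade_at(I0, I1, x, y, win=7):
        """Estimate the flow (u, v) at one pixel by solving the Lucas-Kanade
        normal equations over a win x win window. I0, I1: float grayscale frames."""
        h = win // 2
        Iy_full, Ix_full = np.gradient(I0)             # spatial gradients
        sl = np.s_[y - h:y + h + 1, x - h:x + h + 1]
        A = np.stack([Ix_full[sl].ravel(), Iy_full[sl].ravel()], axis=1)
        b = -(I1 - I0)[sl].ravel()                     # temporal gradient
        flow, *_ = np.linalg.lstsq(A, b, rcond=None)
        return flow
    ```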

  • Parametric OBMC for Pixel-Adaptive Temporal Prediction on Irregular Motion Sampling Grids

    Page(s): 113 - 127

    This paper adapts overlapped block motion compensation (OBMC) to suit variable block-size motion partitioning. The motion vectors (MVs) of the various partitions are formalized as motion samples taken on an irregular grid. From this viewpoint, determining the OBMC weights to associate with these samples becomes an under-determined problem, since a distinct solution has to be sought for each prediction pixel. In this paper, we tackle the problem by expressing the optimal weights in closed form based on parametric signal assumptions. In particular, computing this solution requires only the geometric relations between the prediction pixel and its nearby block centers, leading to a generic framework capable of reconstructing temporal predictors from any irregularly sampled MVs. A modified implementation is also proposed to address MV location uncertainty and to reduce computational complexity. Experimental results demonstrate that our scheme performs better than similar previous work and shows a gain comparable to the recently proposed quadtree-based adaptive loop filter and enhanced adaptive interpolation filter; furthermore, combining it with either of them yields a gain that is almost the sum of their separate improvements.
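
    To make the geometry-dependent weighting concrete, here is a minimal stand-in that blends the predictors of nearby motion samples with inverse-distance weights; the paper's closed-form parametric weights are not reproduced, and `fetch_prediction` is a hypothetical helper:

    ```python
    import numpy as np

    def obmc_pixel(pixel, motion_samples, fetch_prediction, power=2.0, eps=1e-6):
        """Predict one pixel from irregularly placed motion samples.
        motion_samples: list of (block_center, motion_vector) pairs.
        fetch_prediction(pixel, mv) returns the motion-compensated sample."""
        preds, weights = [], []
        for center, mv in motion_samples:
            dist = np.hypot(*np.subtract(pixel, center)) + eps
            weights.append(dist ** -power)        # closer samples weigh more
            preds.append(fetch_prediction(pixel, mv))
        w = np.asarray(weights)
        return float(w @ np.asarray(preds) / w.sum())
    ```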

  • View Interpolation for Medical Images on Autostereoscopic Displays

    Page(s): 128 - 137

    We present an approach for efficiently rendering and transmitting views to a high-resolution autostereoscopic display for medical purposes. Displaying biomedical images on an autostereoscopic display poses different requirements than consumer use cases: for medical usage, it is essential that the perceived image represent the actual clinical data and offer sufficiently high quality for diagnosis or understanding. Autostereoscopic display of multiple views introduces two hurdles: transmission of multi-view data through a bandwidth-limited channel, and the computation time of the volume rendering algorithm. We address both issues by generating and transmitting a limited set of views enhanced with a depth signal per view, and we propose an efficient view interpolation and rendering algorithm at the receiver side based on the texture+depth data representation, which can operate with a limited number of views. We study the main artifacts that occur during rendering, namely occlusions, and quantify them first for a synthetic model and then for real-world biomedical data. The experimental results quantify the peak signal-to-noise ratio of the rendered texture and depth, as well as the number of disoccluded pixels, as a function of the angle between surrounding cameras.
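
    A minimal sketch of the texture+depth warping step for a horizontally shifted virtual view; the parallel-camera model and parameter names are assumptions, and unfilled pixels are left marked so disocclusions can be counted:

    ```python
    import numpy as np

    def warp_view(texture, depth, baseline, focal):
        """Forward-warp a grayscale texture with per-pixel depth (> 0) to a
        horizontally shifted camera. Returns the warped view with disoccluded
        pixels marked as -1."""
        h, w = texture.shape
        out = np.full((h, w), -1.0)
        disparity = np.rint(focal * baseline / depth).astype(int)
        for y in range(h):
            for x in range(w):
                xt = x + disparity[y, x]
                if 0 <= xt < w:
                    out[y, xt] = texture[y, x]
        return out     # (out == -1).sum() counts disoccluded pixels
    ```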

  • Video Coding With Rate-Distortion Optimized Transform

    Page(s): 138 - 151

    The block-based discrete cosine transform (DCT) has been successfully adopted in several international image/video coding standards, e.g., MPEG-2 and H.264/AVC, as it achieves a good tradeoff between performance and complexity. Although the DCT theoretically approximates the optimal Karhunen-Loève transform under first-order Markov conditions, one fixed set of transform basis functions (TBF) cannot handle all cases efficiently, due to the non-stationary nature of video content. To further improve the performance of block-based transform coding, in this paper we present the design of a rate-distortion optimized transform (RDOT), which contributes to both intraframe and interframe coding. The most important property distinguishing RDOT from the conventional DCT is that, in the proposed method, the transform uses multiple TBF candidates obtained from off-line training. With this feature, for coding each residual block, the encoder is capable of selecting the optimal set of TBF in terms of rate-distortion performance, and better energy compaction is achieved in the transform domain. To obtain an optimal group of candidate TBF, we have developed a two-step iterative optimization technique for the off-line training, in which the TBF candidates are refined at each iteration until the training process converges. Moreover, an analysis of the optimal group of candidate TBF is also presented, with a detailed description of a practical implementation of the proposed algorithm on the latest VCEG Key Technology Area software platform. Extensive experimental results show that, compared with the conventional DCT-based transform scheme adopted in the state-of-the-art H.264/AVC video coding standard, the proposed method achieves significant improvements in coding performance for both intraframe and interframe coding.
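
    The per-block selection can be sketched as a small rate-distortion search over the trained bases; the uniform quantizer and the nonzero-coefficient rate proxy are assumptions for illustration:

    ```python
    import numpy as np

    def pick_transform(block, tbf_set, lam, qstep=8.0):
        """Try each candidate orthonormal basis T (block = T.T @ coeff @ T),
        quantize, and keep the index with the least Lagrangian cost. Rate is
        proxied here by the number of nonzero quantized coefficients."""
        best_idx, best_cost = 0, np.inf
        for i, T in enumerate(tbf_set):
            coeff = T @ block @ T.T                 # forward 2-D transform
            q = np.rint(coeff / qstep)
            recon = T.T @ (q * qstep) @ T           # inverse transform
            dist = np.sum((block - recon) ** 2)
            rate = np.count_nonzero(q) + 1          # +1 for the TBF index flag
            cost = dist + lam * rate
            if cost < best_cost:
                best_idx, best_cost = i, cost
        return best_idx
    ```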

  • An Efficient Algorithm for Focus Measure Computation in Constant Time

    Page(s): 152 - 156

    This letter presents an efficient algorithm for computing a focus measure, in constant time, to estimate a depth map from image sequences acquired at varying focus. Two major factors complicate focus measure computation: neighborhood support and gradient detection for oriented intensity variations. We present a distinct focus measure based on steerable filters that is invariant to neighborhood size and accomplishes depth map estimation at considerably higher speed than other well-documented methods. Steerable filters provide an architecture for synthesizing filters of arbitrary orientation from a linear combination of basis filters; such synthesis makes it possible to determine the filter output analytically as a function of orientation. Steerable filters remove the inherent limitations of traditional gradient detection techniques, which perform inadequately on oriented intensity variations and low-textured regions.
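
    The steering property itself is compact: for a first-derivative-of-Gaussian basis, the response at any orientation is a linear combination of two basis responses. A minimal sketch, using the first-derivative case and a simple squared-response focus score, both of which are assumptions rather than the letter's exact measure:

    ```python
    import numpy as np

    def steered_response(gx, gy, theta):
        """Steer a first-derivative filter pair: the response at orientation
        theta from the two basis responses gx, gy (images filtered with the
        d/dx and d/dy derivatives of a Gaussian)."""
        return np.cos(theta) * gx + np.sin(theta) * gy

    def focus_measure(gx, gy, n_orientations=4):
        """Per-pixel focus score: energy of the steered responses over a few
        orientations; sharper (in-focus) pixels score higher."""
        thetas = np.pi * np.arange(n_orientations) / n_orientations
        return sum(steered_response(gx, gy, t) ** 2 for t in thetas)
    ```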

  • A Space-Mapping Method for Object Location Estimation Adaptive to Camera Setup Changes for Vision-Based Automation Applications

    Page(s): 157 - 162

    A new space-mapping method for object location estimation, adaptive to camera setup changes, is proposed for use in various automation applications. The location of an object appearing in an image is estimated by mapping the image coordinates of object points to the corresponding real-world coordinates using a mapping table. The table is constructed in two stages: the first establishes a basic table using bilinear interpolation in the camera manufacturing environment, and the second adapts the table to changes in camera height and orientation in the application field. Analytic equations for table adaptation are derived by exploiting both image formation and camera geometry properties. Experimental results demonstrate the feasibility of the proposed method.
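
    The basic table lookup reduces to bilinear interpolation between calibrated grid points. A minimal sketch, with the grid spacing and table layout as assumptions; (u, v) is assumed to lie inside the calibrated grid:

    ```python
    import numpy as np

    def world_coords(u, v, table, step=32):
        """Map image point (u, v) to real-world coordinates by bilinear
        interpolation in a calibration table; table[i, j] holds the world
        coordinates measured for image point (j * step, i * step)."""
        i, j = v / step, u / step
        i0, j0 = int(i), int(j)                 # surrounding grid cell
        di, dj = i - i0, j - j0
        return ((1 - di) * (1 - dj) * table[i0, j0]
                + (1 - di) * dj * table[i0, j0 + 1]
                + di * (1 - dj) * table[i0 + 1, j0]
                + di * dj * table[i0 + 1, j0 + 1])
    ```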

  • Special issue on circuits, systems and algorithms for compressive sensing

    Page(s): 163
  • Quality without compromise [advertisement]

    Page(s): 164
  • IEEE Circuits and Systems Society Information

    Page(s): C3
  • IEEE Transactions on Circuits and Systems for Video Technology information for authors

    Page(s): C4

Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology 

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it