
Circuits and Systems for Video Technology, IEEE Transactions on

Issue 12 • Date Dec. 2010

  • Table of contents

    Publication Year: 2010 , Page(s): C1 - C4
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Publication Year: 2010 , Page(s): C2
  • Special Section on the Joint Call for Proposals on High Efficiency Video Coding (HEVC) Standardization

    Publication Year: 2010 , Page(s): 1661 - 1666
    Cited by:  Papers (22)  |  Patents (2)
  • Video Coding Using a Simplified Block Structure and Advanced Coding Techniques

    Publication Year: 2010 , Page(s): 1667 - 1675
    Cited by:  Papers (10)

    This paper describes a new video coding scheme based on a simplified block structure that significantly outperforms the coding efficiency of the ISO/IEC 14496-10 ITU-T H.264 advanced video coding (AVC) standard. Its conceptual design is similar to a typical block-based hybrid coder applying prediction and subsequent prediction error coding. The basic coding unit is an 8 × 8 block for inter, and an 8 × 8 or a 16 × 16 block for intra, instead of the usual 16 × 16 macroblock. No larger block sizes are considered for prediction and transform. Based on this simplified block structure, the coding scheme uses simple and fundamental coding tools with optimized encoding algorithms. In particular, the motion representation is based on a minimum partitioning with blocks sharing motion borders. In addition, compared to AVC, the new and improved coding techniques include: block-based intensity compensation, motion vector competition, adaptive motion vector resolution, adaptive interpolation filters, edge-based intra prediction and enhanced chrominance prediction, intra template matching, larger transforms and adaptive switchable transform selection for intra and inter blocks, and nonlinear and frame-adaptive de-noising loop filters. Finally, the entropy coder uses a generic flexible zero-tree representation applied to both motion and texture data. Attention has also been given to algorithm designs that facilitate parallelization. Compared to AVC, the new coding scheme offers clear benefits in terms of subjective video quality at the same bit rate. Objective quality improvements are equally significant. At the same quality, an average bit-rate reduction of 31% compared to AVC is reported.

  • Video Compression Using Nested Quadtree Structures, Leaf Merging, and Improved Techniques for Motion Representation and Entropy Coding

    Publication Year: 2010 , Page(s): 1676 - 1687
    Cited by:  Papers (38)  |  Patents (2)

    A video coding architecture is described that is based on nested and pre-configurable quadtree structures for flexible and signal-adaptive picture partitioning. The primary goal of this partitioning concept is to provide a high degree of adaptability for both temporal and spatial prediction as well as for the purpose of space-frequency representation of prediction residuals. At the same time, a leaf merging mechanism is included in order to prevent excessive partitioning of a picture into prediction blocks and to reduce the amount of bits for signaling the prediction signal. For fractional-sample motion-compensated prediction, a fixed-point implementation of the maximal-order minimum-support algorithm is presented that uses a combination of infinite impulse response and FIR filtering. Entropy coding utilizes the concept of probability interval partitioning entropy codes that offers new ways for parallelization and enhanced throughput. The presented video coding scheme was submitted to a joint call for proposals of ITU-T Visual Coding Experts Group and ISO/IEC Moving Picture Experts Group and was ranked among the five best performing proposals, both in terms of subjective and objective quality.
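The signal-adaptive quadtree partitioning described in this abstract can be sketched as follows. This is a hypothetical Python illustration only: the variance-based split criterion, the size limits, and the test frame are assumptions made for demonstration, not the proposal's actual mode-decision rule.

```python
# Hypothetical sketch of signal-adaptive quadtree partitioning: a block is
# recursively split into four quadrants while a (stand-in) cost criterion
# says splitting helps and the minimum block size has not been reached.

def variance(block):
    """Sample variance of a flat list of pixel values."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n

def partition(pixels, x, y, size, stride, min_size=8, var_thresh=100.0):
    """Return leaf blocks as (x, y, size) tuples."""
    block = [pixels[(y + j) * stride + (x + i)]
             for j in range(size) for i in range(size)]
    if size <= min_size or variance(block) < var_thresh:
        return [(x, y, size)]          # homogeneous enough: keep as one leaf
    half = size // 2
    leaves = []
    for dy in (0, half):               # recurse into the four quadrants
        for dx in (0, half):
            leaves += partition(pixels, x + dx, y + dy, half, stride,
                                min_size, var_thresh)
    return leaves

# A 16x16 frame that is flat except for a busy top-left corner:
frame = [0] * (16 * 16)
for j in range(8):
    for i in range(8):
        frame[j * 16 + i] = (i * j * 37) % 255
print(partition(frame, 0, 0, 16, 16))
```

In a real codec the split decision would come from rate-distortion cost rather than pixel variance, and the tree structure itself would be entropy-coded; the sketch only shows the recursive partitioning mechanism.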

  • High Performance, Low Complexity Video Coding and the Emerging HEVC Standard

    Publication Year: 2010 , Page(s): 1688 - 1697
    Cited by:  Papers (38)  |  Patents (2)

    This paper describes a low complexity video codec with high coding efficiency. It was proposed to the high efficiency video coding (HEVC) standardization effort of moving picture experts group and video coding experts group, and has been partially adopted into the initial HEVC test model under consideration design. The proposal utilizes a quadtree-based coding structure with support for macroblocks of size 64 × 64, 32 × 32, and 16 × 16 pixels. Entropy coding is performed using a low complexity variable length coding scheme with improved context adaptation compared to the context adaptive variable length coding design in H.264/AVC. The proposal's interpolation and deblocking filter designs improve coding efficiency, yet have low complexity. Finally, intra-picture coding methods have been improved to provide better subjective quality than H.264/AVC. The subjective quality of the proposed codec has been evaluated extensively within the HEVC project, with results indicating that similar visual quality to H.264/AVC High Profile anchors is achieved, measured by mean opinion score, using significantly fewer bits. Coding efficiency improvements are achieved with lower complexity than the H.264/AVC Baseline Profile, particularly suiting the proposal for high resolution, high quality applications in resource-constrained environments.

  • A Hybrid Video Coder Based on Extended Macroblock Sizes, Improved Interpolation, and Flexible Motion Representation

    Publication Year: 2010 , Page(s): 1698 - 1708
    Cited by:  Papers (15)  |  Patents (3)

    This paper describes a video coding technology proposal submitted by Qualcomm in response to a joint call for proposals (CfP) issued by ITU-T SG16 Q.6 (VCEG) and ISO/IEC JTC1/SC29/WG11 (MPEG) in January 2010. The proposed video codec follows a hybrid coding approach based on temporal prediction, followed by transform, quantization, and entropy coding of the residual. Some of its key features are extended block sizes (up to 64 × 64), single pass switched interpolation filters with offsets, mode-dependent directional transforms for intra-coding, luma and chroma high precision filtering, geometric motion partitions, adaptive motion vector resolution, and efficient 16-point transforms. It also incorporates internal bit-depth increase and modified quadtree-based adaptive loop filtering. Simulation results are presented to demonstrate the high compression efficiency achieved by the proposed video codec at the expense of moderate increase in encoding and decoding complexity compared to the advanced video coding standard (AVC/H.264). For the random access and low delay configurations, it achieved average bit rate reductions of 30.9% and 33.0% for equivalent peak signal-to-noise ratio, respectively, compared to the corresponding AVC anchors. The proposed codec scored highly in both subjective evaluations and objective metrics and was among the best-performing CfP proposals.

  • Improved Video Compression Efficiency Through Flexible Unit Representation and Corresponding Extension of Coding Tools

    Publication Year: 2010 , Page(s): 1709 - 1720
    Cited by:  Papers (37)

    This paper proposes a novel video compression scheme based on a highly flexible hierarchy of unit representation which includes three block concepts: coding unit (CU), prediction unit (PU), and transform unit (TU). This separation of the block structure into three different concepts allows each to be optimized according to its role; the CU is a macroblock-like unit which supports region splitting in a manner similar to a conventional quadtree, the PU supports nonsquare motion partition shapes for motion compensation, while the TU allows the transform size to be defined independently from the PU. Several other coding tools are extended to arbitrary unit size to maintain consistency with the proposed design, e.g., transform size is extended up to 64 × 64 and intraprediction is designed to support an arbitrary number of angles for variable block sizes. Other novel techniques such as a new noncascading interpolation filter design allowing arbitrary motion accuracy and a leaky prediction technique using both open-loop and closed-loop predictors are also introduced. The video codec described in this paper was a candidate in the competitive phase of the high-efficiency video coding (HEVC) standardization work. Compared to H.264/AVC, it demonstrated bit rate reductions of around 40% based on objective measures and around 60% based on subjective testing with 1080p sequences. It has been partially adopted into the first standardization model of the collaborative phase of the HEVC effort.

  • Application-Centric Routing for Video Streaming Over Multihop Wireless Networks

    Publication Year: 2010 , Page(s): 1721 - 1734
    Cited by:  Papers (6)

    Most existing works on routing for video transmission over multihop wireless networks only focus on how to satisfy the network-oriented quality-of-service (QoS), such as throughput, delay, and packet loss rate, rather than application-oriented QoS such as the user-perceived video quality. Although there are some research efforts which use application-centric video quality as the routing metric, they either calculate the video quality based on some predefined rate-distortion function or model without considering the impact of video coding and decoding (including error concealment) on routing, or use exhaustive search or heuristic methods to find the optimal path, leading to high computational complexity and/or suboptimal solutions. In this paper, we propose an application-centric routing framework for real-time video transmission in multihop wireless networks, where expected video distortion is adopted as the routing metric. The major contributions of this paper are: 1) the development of an efficient routing algorithm with the routing metric expressed in terms of the expected video distortion and being calculated on-the-fly, and 2) the development of a quality-driven cross-layer optimization framework to enhance the flexibility and robustness of routing by the joint optimization of routing path selection and video coding, thereby maximizing the user-perceived video quality under a given video playback delay constraint. Both theoretical and experimental results demonstrate that the proposed quality-driven application-centric routing approach can achieve a superior performance over existing network-centric routing approaches.
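As a rough illustration of using expected distortion as a routing metric, the following sketch scores candidate paths by the distortion expected after packet losses rather than by hop count or raw loss rate. The tiny topology, per-link loss rates, and concealment-distortion value are all hypothetical, and the paper's actual metric is computed on-the-fly jointly with the video coder rather than from a fixed table.

```python
# Hypothetical sketch: a path's end-to-end loss probability is composed from
# per-link loss rates, and the expected distortion (concealment distortion
# weighted by the loss probability) serves as the routing metric.

def expected_distortion(link_losses, d_conceal, d_decoded=1.0):
    """Expected distortion of a path given its per-link loss probabilities."""
    p_success = 1.0
    for p in link_losses:
        p_success *= (1.0 - p)
    p_loss = 1.0 - p_success           # end-to-end loss probability
    return p_loss * d_conceal + (1.0 - p_loss) * d_decoded

paths = {
    "A-B-D":   [0.05, 0.10],           # short but lossy second hop
    "A-C-E-D": [0.02, 0.03, 0.02],     # longer but cleaner links
}
best = min(paths, key=lambda p: expected_distortion(paths[p], d_conceal=50.0))
print(best)
```

Note how the longer, cleaner path wins once loss-induced distortion dominates, which a hop-count or throughput metric would miss.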

  • Direct Techniques for Optimal Sub-Pixel Motion Accuracy Estimation and Position Prediction

    Publication Year: 2010 , Page(s): 1735 - 1744
    Cited by:  Papers (5)

    Direct techniques for the optimal resolution estimation and position prediction of subpel motion vectors (MVs) based on integer-pel MVs are investigated in this paper. Although it is common to determine the optimal MV position by fitting a local error surface using integer-pel MVs, the characteristics of the error surface have not been thoroughly studied in the past. Here, we use an approximate condition number of the Hessian matrix of the error surface to characterize its shape in a local region. By exploiting this shape information, we propose a block-based subpel MV resolution estimation method that allows each block to choose its optimal subpel MV resolution for the optimal rate-distortion (R-D) performance adaptively. Furthermore, we propose two MV position prediction schemes for ill- and well-conditioned error surfaces, respectively. All proposed techniques are direct methods, where no iteration is required. Experimental results are given to show the R-D performance of the proposed subpel MV resolution estimation and position prediction schemes.
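The error-surface shape analysis in this abstract can be made concrete with a small sketch: estimate the Hessian of the matching-error surface at the best integer-pel position by central differences over the 3 × 3 error neighborhood, then take its condition number. The quadratic test surfaces below are illustrative assumptions; the paper's approximation and thresholds may differ.

```python
# Hypothetical sketch: the condition number of the 2x2 Hessian of the
# matching-error surface distinguishes well-conditioned (bowl-shaped) from
# ill-conditioned (valley-shaped) local error surfaces.
import math

def hessian_condition(e):
    """e[j][i]: matching error at offset (i-1, j-1) from the best position."""
    exx = e[1][2] - 2 * e[1][1] + e[1][0]                # d2E/dx2
    eyy = e[2][1] - 2 * e[1][1] + e[0][1]                # d2E/dy2
    exy = (e[2][2] - e[2][0] - e[0][2] + e[0][0]) / 4    # d2E/dxdy
    # Eigenvalues of the symmetric 2x2 matrix [[exx, exy], [exy, eyy]]:
    half_trace = (exx + eyy) / 2
    radius = math.sqrt(((exx - eyy) / 2) ** 2 + exy ** 2)
    lam1, lam2 = half_trace + radius, half_trace - radius
    return abs(lam1) / abs(lam2)

# Isotropic bowl E = x^2 + y^2: condition number 1 (well-conditioned).
bowl = [[(i - 1) ** 2 + (j - 1) ** 2 for i in range(3)] for j in range(3)]
# Elongated valley E = 9x^2 + y^2: condition number 9 (toward ill-conditioned).
valley = [[9 * (i - 1) ** 2 + (j - 1) ** 2 for i in range(3)] for j in range(3)]
print(hessian_condition(bowl), hessian_condition(valley))
```

A large condition number signals a valley-like surface along which the subpel minimum is poorly localized in one direction, motivating a different position-prediction scheme than for the well-conditioned case.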

  • Recognizing Cartoon Image Gestures for Retrieval and Interactive Cartoon Clip Synthesis

    Publication Year: 2010 , Page(s): 1745 - 1756
    Cited by:  Papers (5)

    In this paper, we propose a new method to recognize gestures of cartoon images with two practical applications, i.e., content-based cartoon image retrieval and interactive cartoon clip synthesis. Upon analyzing the unique properties of four types of features including global color histogram, local color histogram (LCH), edge feature (EF), and motion direction feature (MDF), we propose to employ different features for different purposes and in various phases. We use EF to define a graph and then refine its local structure by LCH. Based on this graph, we adopt a transductive learning algorithm to construct local patches for each cartoon image. A spectral method is then proposed to optimize the local structure of each patch and then align these patches globally. MDF is fused with EF and LCH and a cartoon gesture space is constructed for cartoon image gesture recognition. We apply the proposed method to content-based cartoon image retrieval and interactive cartoon clip synthesis. The experiments demonstrate the effectiveness of our method.

  • Color Distribution Information for the Reduced-Reference Assessment of Perceived Image Quality

    Publication Year: 2010 , Page(s): 1757 - 1769
    Cited by:  Papers (8)

    Reduced-reference systems can predict in real-time the perceived quality of images for digital broadcasting, only requiring that a limited set of features, extracted from the original undistorted signals, is transmitted together with the image data. This paper uses descriptors based on the color correlogram, analyzing the alterations in the color distribution of an image as a consequence of the occurrence of distortions, for the reduced-reference data. The processing architecture relies on a double layer at the receiver end. The first layer identifies the kind of distortion that may affect the received signal. The second layer deploys a dedicated prediction module for each type of distortion; every predictor yields an objective quality score, thus completing the estimation process. Computational-intelligence models are used extensively to support both layers with empirical training. The double-layer architecture implements a general-purpose image quality assessment system, not being tied to specific distortions and, at the same time, it allows us to benefit from the accuracy of specific, distortion-targeted metrics. Experimental results based on subjective quality data confirm the general validity of the approach.

  • A Hierarchical Bayesian Generation Framework for Vacant Parking Space Detection

    Publication Year: 2010 , Page(s): 1770 - 1785
    Cited by:  Papers (10)

    In this paper, from the viewpoint of scene understanding, a three-layer Bayesian hierarchical framework (BHF) is proposed for robust vacant parking space detection. In practice, the challenges of vacant parking space inference come from dramatic luminance variations, shadow effect, perspective distortion, and the inter-occlusion among vehicles. By using a hidden labeling layer between an observation layer and a scene layer, the BHF provides a systematic generative structure to model these variations. In the proposed BHF, the problem of luminance variations is treated as a color classification problem and is tackled via a classification process from the observation layer to the labeling layer, while the occlusion pattern, perspective distortion, and shadow effect are well modeled by the relationships between the scene layer and the labeling layer. With the BHF scheme, the detection of vacant parking spaces and the labeling of scene status are regarded as a unified Bayesian optimization problem subject to a shadow generation model, an occlusion generation model, and an object classification model. The system accuracy was evaluated by using outdoor parking lot videos captured from morning to evening. Experimental results showed that the proposed framework can systematically determine the vacant space number, efficiently label ground and car regions, precisely locate the shadowed regions, and effectively tackle the problem of luminance variations.

  • Intra Coding With Prediction Mode Information Inference

    Publication Year: 2010 , Page(s): 1786 - 1796
    Cited by:  Papers (3)

    In a typical competition-based coding, the pertinence of a prediction mode depends not only on its own efficiency but also on how complementary it is with the other modes. The method proposed in this paper to improve the intra coding of the H.264/AVC standard relies on this observation; it shows how the cost of signaling predictors that are quite similar can be avoided. Indeed, at low bitrates, the information related to the predictor signaling in intra coding reaches up to 25% of the total bitrate for the whole set of standard VCEG test sequences. In order to reduce this cost, a method reproducible at the decoder side is proposed to eliminate some predictors from the intra predictor set. The proposed method exploits the proximity of the predictors in the transform domain in order to obtain a representative and non-redundant set of predictors.
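The transform-domain pruning idea in this abstract can be sketched as follows. This is a hypothetical Python illustration: the 2 × 2 Hadamard transform, the candidate blocks, and the distance threshold are all assumptions for demonstration; the actual method operates on H.264/AVC intra predictors with its own transform and criterion.

```python
# Hypothetical sketch: each candidate predictor block is mapped to the
# transform domain and a candidate is dropped when it lies within a distance
# threshold of a predictor already kept -- a rule the decoder can reproduce,
# since it only depends on previously decoded neighboring pixels.

def hadamard2(b):
    """2x2 Hadamard transform of block [[p, q], [r, s]]."""
    (p, q), (r, s) = b
    return [p + q + r + s, p - q + r - s, p + q - r - s, p - q - r + s]

def prune(predictors, thresh=2.5):
    """Keep only predictors that are pairwise distinct in the transform domain."""
    kept, coeffs = [], []
    for idx, block in enumerate(predictors):
        c = hadamard2(block)
        if all(sum((a - b) ** 2 for a, b in zip(c, k)) ** 0.5 > thresh
               for k in coeffs):
            kept.append(idx)
            coeffs.append(c)
    return kept

candidates = [
    [[10, 10], [10, 10]],   # DC-like predictor
    [[10, 10], [10, 11]],   # nearly identical to the first -> pruned
    [[0, 20], [0, 20]],     # strong vertical structure -> kept
]
print(prune(candidates))
```

Because the pruning rule uses only information available at both ends, the encoder can signal a predictor index into the reduced set and the decoder reconstructs the same set, saving signaling bits for near-duplicate predictors.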

  • High-Performance Optical-Flow Architecture Based on a Multi-Scale, Multi-Orientation Phase-Based Model

    Publication Year: 2010 , Page(s): 1797 - 1807
    Cited by:  Papers (4)

    The accurate estimation of optical flow is a problem widely encountered in computer vision, and researchers in this field are devoting their efforts to formulating reliable and robust algorithms for real-life applications. These approaches need to be evaluated, especially in controlled scenarios. Because of their stability, phase-based methods have generally been adopted in the various techniques developed to date, although it is still difficult to be sure of their viability in real-time systems due to their high computational load. We describe here the implementation of a phase-based optical-flow algorithm in a field-programmable gate array (FPGA) device. The system benefits from phase-information stability as well as sub-pixel accuracy without requiring additional computations, and at the same time achieves high-performance computation by taking full advantage of the parallel processing resources of FPGA devices. Furthermore, the architecture extends the implementation to a multi-resolution and multi-orientation one, which enhances its accuracy and covers a wide range of detected velocities. A deep pipelined datapath architecture with superscalar computing units at different stages allows real-time processing beyond VGA image resolution. The final circuit is of significant complexity and useful for a wide range of fields requiring portable optical-flow processing engines.

  • Large-Scale Concept Detection in Multimedia Data Using Small Training Sets and Cross-Domain Concept Fusion

    Publication Year: 2010 , Page(s): 1808 - 1821

    This paper presents the concept detector module developed for the VITALAS multimedia retrieval system. It outlines its architecture and major implementation aspects, including a set of procedures and tools that were used for the development of detectors for more than 500 concepts. The focus is on aspects that increase the system's scalability in terms of the number of concepts: collaborative concept definition and disambiguation, selection of small but sufficient training sets and efficient manual annotation. The proposed architecture uses cross-domain concept fusion to improve effectiveness and reduce the number of samples required for concept detector training. Two criteria are proposed for selecting the best predictors to use for fusion and their effectiveness is experimentally evaluated for 221 concepts on the TRECVID-2005 development set and 132 concepts on a set of images provided by the Belga news agency. In these experiments, cross-domain concept fusion performed better than early fusion for most concepts. Experiments with variable training set sizes also indicate that cross-domain concept fusion is more effective than early fusion when the training set size is small.

  • Joint Temporal and Spatial Error Concealment for Multiple Description Video Coding

    Publication Year: 2010 , Page(s): 1822 - 1833
    Cited by:  Papers (5)

    Transmission of compressed video signals over error-prone networks exposes the information to losses and errors. To reduce the effects of these losses and errors, this paper presents a joint spatial-temporal estimation method which takes advantage of data correlation in these two domains for better recovery of the lost information. The method is designed for the hybrid multiple description coding which splits video signals along spatial and temporal dimensions. In particular, the proposed method includes fixed and content-adaptive approaches for estimation method selection. The fixed approach selects the estimation method based on description loss cases, while the adaptive approach selects the method according to pixel gradients. The experimental results demonstrate that improved error resilience can be accomplished by the proposed estimation method.

  • Network Coding of Rateless Video in Streaming Overlays

    Publication Year: 2010 , Page(s): 1834 - 1847
    Cited by:  Papers (12)

    We present a system for collaborative video streaming in wired overlay networks. We propose a scheme that builds on both rateless codes and network coding in order to improve the system throughput and the video quality at clients. Our hybrid coding algorithm makes it possible to efficiently exploit the available source and path diversity without the need for expensive routing or scheduling algorithms. We consider specifically an architecture where multiple streaming servers simultaneously deliver video information to a set of clients. The servers apply Raptor coding on the video packets for error resiliency, and the overlay nodes selectively combine the Raptor coded video packets in order to increase the packet diversity in the system. We analyze the performance of selective network coding and describe its application to practical video streaming systems. We further compute an effective source and channel rate allocation in our collaborative streaming system. We estimate the expected symbol diversity at clients with respect to the coding choices. Then we cast a min-max quality optimization problem that is solved by a low-cost bisection-based method. The experimental evaluation demonstrates that our system typically outperforms Raptor video streaming systems that do not use network coding as well as systems that perform decoding and encoding in the network nodes. Finally, our solution has a low complexity and only requires small buffers in the network coding nodes, which are certainly two important advantages toward deployment in practical streaming systems.
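The min-max optimization mentioned in this abstract (maximize the worst client quality) lends itself to bisection whenever feasibility of a target quality is easy to check. The sketch below is a hypothetical illustration: the per-client rate-quality curves and the rate budget are invented for demonstration, not the paper's model.

```python
# Hypothetical sketch of a bisection-based min-max step: find the largest
# worst-case quality q such that serving every client at quality >= q stays
# within the total server rate budget.

def max_worst_quality(needs, budget, lo=0.0, hi=100.0, iters=50):
    """Bisection on q; feasible iff the summed per-client rates fit the budget."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(need(mid) for need in needs) <= budget:
            lo = mid        # q = mid is achievable: try higher
        else:
            hi = mid        # infeasible: lower the target
    return lo

# Two clients with different channel efficiencies (rate = q / efficiency):
needs = [lambda q: q / 0.8, lambda q: q / 0.5]
print(round(max_worst_quality(needs, budget=65.0), 2))
```

Each bisection step costs only one feasibility evaluation, which is what makes the method "low-cost" compared to searching the rate-allocation space directly.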

  • A Real-Time H.264/AVC Encoder With Complexity-Aware Time Allocation

    Publication Year: 2010 , Page(s): 1848 - 1862
    Cited by:  Papers (5)

    This paper presents a novel processing time control algorithm for a hardware-based H.264/AVC encoder. The encoder employs three complexity scaling methods: partial cost evaluation for fractional motion estimation (FME), block size adjustment for FME, and search range adjustment for integer motion estimation (IME). With these methods, 12 complexity levels are defined to support tradeoffs between the processing time and compression efficiency. A speed control algorithm is proposed to select the complexity level that compresses most efficiently among those that meet the target time budget. The time budget is allocated to each macroblock based on the complexity of the macroblock and on the execution time of other macroblocks in the frame. For main profile compression, an additional complexity scaling method called direction filtering is proposed to select the prediction direction of FME by comparing the costs resulting from forward and backward IMEs. With direction filtering in addition to the three complexity scaling methods for baseline compression, 32 complexity levels are defined for main profile compression. Experimental results show that the speed control algorithm guarantees the processing time to meet the given time budget with negligible quality degradation. Various complexity levels for speed control are also used to speed up the encoding time with a slight degradation in quality and a minor reduction of the compression efficiency.
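The core speed-control decision described in this abstract (pick the most efficient complexity level that still fits the macroblock's time budget) can be sketched in a few lines. The (time, efficiency) table below is an illustrative assumption, not the encoder's measured profile.

```python
# Hypothetical sketch of the speed-control decision: among the complexity
# levels whose predicted processing time fits the time budget, pick the one
# with the best compression efficiency; if nothing fits, fall back to the
# fastest level.

def select_level(levels, budget_us):
    """levels: list of (predicted_time_us, efficiency_score) per level."""
    feasible = [(eff, t) for t, eff in levels if t <= budget_us]
    if not feasible:                       # nothing fits: fall back to fastest
        return min(range(len(levels)), key=lambda i: levels[i][0])
    best_eff, best_t = max(feasible)
    return levels.index((best_t, best_eff))

# Four of the twelve levels, from fastest/least efficient to slowest/best:
levels = [(40, 0.70), (65, 0.82), (90, 0.91), (130, 1.00)]
print(select_level(levels, budget_us=100))   # level 2 fits; level 3 does not
print(select_level(levels, budget_us=30))    # nothing fits -> fastest level 0
```

In the actual encoder the per-macroblock budget would be updated from the measured execution time of earlier macroblocks in the frame, so slack from easy macroblocks can be spent on harder ones.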

  • Motion Refinement Based Progressive Side-Information Estimation for Wyner-Ziv Video Coding

    Publication Year: 2010 , Page(s): 1863 - 1875
    Cited by:  Papers (6)

    During the past ten years, Wyner-Ziv video coding (WZVC) has gained a lot of research interest because of its unique characteristics of "simple encoding, complex decoding." However, the performance gap between WZVC and conventional video coding has never been closed to the point promised by information theory. In this paper, we illustrate the chicken-and-egg dilemma encountered in WZVC: high efficiency WZVC requires good estimation of side information (SI); however, good SI estimation is not possible for the decoder without access to the decoded current frame. To resolve such a dilemma, we present and advocate a framework that explores an important concept of decoder-side progressive learning. More specifically, a decoder-side multi-resolution motion refinement (MRMR) scheme is proposed, where the decoder is able to learn from the already-decoded lower-resolution data to refine the motion estimation (ME), which in turn greatly improves the SI quality as well as the coding efficiency for the higher resolution data. Theoretical analysis shows that at high rates, decoder-side MRMR outperforms motion extrapolation by as much as 5 dB, while falling behind conventional encoder-side inter-frame ME by only about 1.5 dB. In addition, since decoder-side ME does not suffer from the bit-rate overhead in transmitting the motion information, further performance gain can be achieved for decoder-side MRMR by incorporating fractional-pel motion search, block matching with smaller block sizes, and multiple hypothesis prediction. We also present a practical WZVC implementation with MRMR, which shows comparable coding performance as H.264 at very high bit rates.

  • Rate Distortion Performance of Pyramid and Subband Motion Compensation Based on Quantization Theory

    Publication Year: 2010 , Page(s): 1876 - 1881
    Cited by:  Papers (1)

    We present in this letter the rate distortion (RD) performance analysis of the inter-layer subband and pyramid motion compensation techniques for spatially scalable video coding. Theoretical performance functions are derived based on RD theory and quantization noise modeling. We assume the base layer is encoded by a non-scalable coder. The coding efficiency of the enhancement (enh.) layer, measured by the signal-to-noise ratio, is determined by the input video power spectral density, the motion estimation error distribution, and the base layer encoder rate. Numerical evaluations of the performance functions show that, compared to independent motion-compensated prediction encoding of the enh. layer, the inter-layer pyramid and subband methods are expected to be more efficient if the base layer is encoded at a sufficiently high quality or the motion estimation accuracy is relatively low in the enh. layer. Results from real video data encoding show that the presented theoretical analysis can be very useful to understand the efficiencies of these spatial scalability techniques.
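The kind of RD-theoretic analysis this letter builds on starts from the classical result for a memoryless Gaussian source: D(R) = σ² · 2^(−2R), i.e., each extra bit per sample buys roughly 6.02 dB of SNR. The snippet below is a textbook illustration of that baseline relation, not the letter's derived inter-layer performance functions.

```python
# Textbook Gaussian rate-distortion function D(R) = sigma^2 * 2^(-2R) and the
# resulting SNR, illustrating the ~6.02 dB-per-bit rule that underlies
# quantization-theoretic coding-efficiency analyses.
import math

def gaussian_distortion(sigma2, rate_bits):
    """Distortion of an optimally coded Gaussian source at rate_bits/sample."""
    return sigma2 * 2.0 ** (-2.0 * rate_bits)

def snr_db(sigma2, rate_bits):
    """SNR = 10 log10(source variance / distortion)."""
    return 10.0 * math.log10(sigma2 / gaussian_distortion(sigma2, rate_bits))

for r in (1, 2, 4, 8):
    print(r, round(snr_db(1.0, r), 2))   # ~6.02 dB per bit
```

Real coders fall short of this bound, and the letter's functions additionally account for the power spectral density of the video, the motion-estimation error distribution, and the base-layer rate.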

  • Scene Change Aware Intra-Frame Rate Control for H.264/AVC

    Publication Year: 2010 , Page(s): 1882 - 1886
    Cited by:  Papers (5)  |  Patents (1)

    Most rate-control research focuses on inter-coded frames rather than intra-coded frames, even though the latter are more likely to cause buffer overflow. This letter presents a rate control algorithm for intra-frame coding. We propose a Taylor series-based rate-QS model and a scene-change-aware rate-QS model to determine quantization parameters for general intra frames and scene-change frames, respectively. Simulation results show that, compared to competing approaches, the proposed method achieves better and more stable quality with low buffer fullness.
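
    The rate-QS modeling step can be sketched generically: fit a low-order model of bits versus quantization step from past observations, then invert it for the step that meets the intra-frame bit budget. The second-order form below is a common hypothetical choice, not the letter's actual Taylor-series or scene-change models.

```python
import numpy as np

def fit_rate_qs(points):
    """Least-squares fit of a hypothetical second-order rate-QS model
    R(QS) = a/QS + b/QS**2 from (QS, bits) observations."""
    qs = np.array([p[0] for p in points], dtype=float)
    r = np.array([p[1] for p in points], dtype=float)
    A = np.column_stack([1.0 / qs, 1.0 / qs ** 2])
    (a, b), *_ = np.linalg.lstsq(A, r, rcond=None)
    return a, b

def qs_for_target(a, b, r_target):
    """Invert R(QS) = a/QS + b/QS**2 for the step meeting the bit budget:
    with x = 1/QS, solve b*x**2 + a*x - R = 0 and take the positive root."""
    x = (-a + np.sqrt(a * a + 4.0 * b * r_target)) / (2.0 * b)
    return 1.0 / x
```

    A rate controller would refit the model as frames are coded and map the resulting quantization step to the nearest legal quantization parameter.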

  • Ambient Illumination as a Context for Video Bit Rate Adaptation Decision Taking

    Publication Year: 2010 , Page(s): 1887 - 1891
    Cited by:  Papers (2)

    In this letter, a new bit rate adaptation decision-taking technique is proposed based on the observation that the ambient illumination context affects perceived video quality. The motion activity and structural complexity characteristics of the video content are used as content-related contexts in the proposed technique. Experimental results demonstrate that a significant amount of bit rate can be saved while maintaining the same perceptual quality by adapting the video content to different ambient illumination conditions. Subjective assessment results show that the proposed adaptation technique is capable of exploiting changes in ambient illumination level for bit rate adaptation without sacrificing perceived visual quality.

  • Parametric Interpolation Filter for HD Video Coding

    Publication Year: 2010 , Page(s): 1892 - 1897
    Cited by:  Papers (4)

    Recently, the adaptive interpolation filter (AIF) for motion-compensated prediction (MCP) has received increasing attention. This letter studies existing AIF techniques and points out that the tradeoff between two conflicting aspects, the accuracy of the coefficients and the size of the side information, is the major obstacle to improving AIF techniques that code the filter coefficients individually. To overcome this obstacle, a parametric interpolation filter (PIF) is proposed for MCP, which represents interpolation filters by a function determined by five parameters instead of by individual coefficients. The function is designed around the fact that the high-frequency energy of HD video sources is mainly distributed along the vertical and horizontal directions; the parameters are calculated to minimize the energy of the prediction error. Experimental results show that PIF outperforms existing AIF techniques and approaches the efficiency of the optimal filter.
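
    The coefficients-versus-side-information tradeoff can be sidestepped exactly as described: transmit a handful of parameters and regenerate the whole filter at the decoder. Below is a hypothetical two-parameter example (a windowed-sinc cutoff plus a Kaiser window shape), not PIF's actual five-parameter function.

```python
import numpy as np

def parametric_halfpel_filter(taps=6, cutoff=0.9, beta=2.0):
    """Generate a half-pel interpolation filter from a few parameters
    instead of coding each tap individually: a sinc truncated to `taps`
    positions, shaped by a Kaiser window, normalized to unity DC gain."""
    # tap positions relative to the half-pel location (symmetric half-integers)
    n = np.arange(taps) - (taps - 1) / 2.0
    h = np.sinc(cutoff * n) * np.kaiser(taps, beta)
    return h / h.sum()  # unity DC gain: flat areas are passed unchanged
```

    Only (taps, cutoff, beta) would need to be signaled; the decoder rebuilds identical coefficients, so the side-information cost no longer grows with the filter length.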

  • Normalized Co-Occurrence Mutual Information for Facial Pose Detection Inside Videos

    Publication Year: 2010 , Page(s): 1898 - 1902

    Human faces captured in video often appear with variable poses, making them difficult to recognize; pose detection thus becomes crucial for face recognition in uncontrolled environments. While existing mutual information (MI) measures primarily consider the relationship between corresponding individual pixels, we propose a normalized co-occurrence mutual information in this letter to capture the information embedded not only in corresponding pixel values but also in their geographical locations. In comparison with existing MI measures, the proposed one offers the essential advantage that both marginal entropy and joint entropy can be optimally exploited in measuring the similarity between two given images. When developed into a facial pose detection algorithm for video sequences, we show through extensive experiments that the design achieves the best performance among all the representative existing techniques compared.
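
    For reference, the baseline (non-co-occurrence) normalized MI between two images can be computed from their joint intensity histogram; the letter's proposed measure additionally encodes pixel locations, which this sketch does not attempt.

```python
import numpy as np

def normalized_mutual_information(a, b, bins=32):
    """Baseline normalized MI from the joint intensity histogram of two
    equally sized images: NMI = (H(A) + H(B)) / H(A, B). It ranges from
    1 (independent) to 2 (identical up to binning)."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()                 # joint distribution estimate
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)  # marginals

    def entropy(p):
        p = p[p > 0]                        # 0 * log 0 := 0
        return -(p * np.log2(p)).sum()

    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())
```

    The normalization by joint entropy is what makes both marginal and joint terms contribute to the score, the property the abstract highlights.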


Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology 

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it