Visual Communications and Image Processing (VCIP), 2012 IEEE

Date 27-30 Nov. 2012

Displaying Results 1 - 25 of 139
  • [Front cover]

    Publication Year: 2012 , Page(s): 1
    Freely Available from IEEE
  • Organizing committee

    Publication Year: 2012 , Page(s): 1 - 8
    Freely Available from IEEE
  • Technical program

    Publication Year: 2012 , Page(s): 1 - 38
    Freely Available from IEEE
  • Authors index

    Publication Year: 2012 , Page(s): 1 - 11
    Freely Available from IEEE
  • Improvement of normality and orthogonality in HEVC transform bases

    Publication Year: 2012 , Page(s): 1 - 6

    This paper presents transform bases, derived from the integer DCT of High Efficiency Video Coding (HEVC), with better normality and orthogonality properties than the HEVC transform bases. Coding and re-encoding experiments were conducted using the HEVC test model (HM) version 6.0 and the proposed method under various bitrate ranges. Under the high-bit-rate condition, the proposed method exhibited coding gains compared to HM 6.0 without increasing the encoding/decoding time.
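
    As an illustration of what normality and orthogonality mean for an integer transform, the sketch below measures both on the 4x4 HEVC core transform matrix. The two error measures (`normality_error`, `orthogonality_error`) are assumptions made for this example; the abstract does not state which metrics the paper uses, and the proposed bases themselves are not reproduced here.

```python
# 4x4 HEVC core transform matrix (a scaled integer approximation of the DCT).
HEVC_DCT4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def gram(matrix):
    """Return M * M^T: the diagonal holds squared row norms,
    the off-diagonal entries hold dot products between different rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]


def normality_error(matrix):
    """Illustrative measure: spread between the largest and smallest squared row norm."""
    g = gram(matrix)
    norms = [g[i][i] for i in range(len(g))]
    return max(norms) - min(norms)


def orthogonality_error(matrix):
    """Illustrative measure: largest absolute dot product between two different rows."""
    g = gram(matrix)
    n = len(g)
    return max(abs(g[i][j]) for i in range(n) for j in range(n) if i != j)


if __name__ == "__main__":
    print("squared row norms:", [row[i] for i, row in enumerate(gram(HEVC_DCT4))])
    print("normality error:", normality_error(HEVC_DCT4))
    print("orthogonality error:", orthogonality_error(HEVC_DCT4))
```

    For this 4x4 matrix the rows are mutually orthogonal, but their squared norms are not all equal (16384 for the even rows, 16370 for the odd rows), which is the kind of deviation a normality improvement would target.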

  • SEQM: Edge quality assessment based on structural pixel matching

    Publication Year: 2012 , Page(s): 1 - 6

    A novel quality metric for binary edge maps, called the structural edge quality metric (SEQM), is proposed in this work. First, we define the matching cost between an edge pixel in a detected edge map and its candidate matching pixel in the ground-truth edge map. The matching cost includes a structural term, as well as a positional term, to measure the discrepancy between the local structures around the two pixels. Then, we determine the optimal matching pairs of pixels using graph-cut optimization, in which a smoothness term is employed to take global edge structures into account in the matching. Finally, we sum the matching costs of all edge pixels to determine the quality index of the detected edge map. Simulation results demonstrate that the proposed SEQM provides more faithful and reliable quality indices than conventional metrics.

  • Gaze-Driven video streaming with saliency-based dual-stream switching

    Publication Year: 2012 , Page(s): 1 - 6

    A person's ability to perceive image details falls precipitously as the angle away from the visual focus increases. At any given bitrate, perceived visual quality can be improved by employing region-of-interest (ROI) coding, where higher encoding quality is judiciously applied only to regions close to a viewer's focal point. Straightforward matching of the viewer's focal point with ROI coding using a live encoder, however, is computation-intensive. In this paper, we propose a system that supports ROI coding without the need for a live encoder. The system is based on dynamic switching between two pre-encoded streams of the same content: one at high quality (HQ), and the other at mixed quality (MQ), where the quality of a spatial region depends on its pre-computed visual saliency values. Distributed source coding (DSC) frames are periodically inserted to facilitate switching. Using a Hidden Markov Model (HMM) to model a viewer's temporal gaze movement, the MQ stream is pre-encoded based on ROI coding to minimize the expected streaming rate, while keeping the probability of a viewer observing low quality (LQ) spatial regions below an application-specific ϵ. At stream time, the viewer's gaze locations are collected and transmitted to the server for intelligent stream switching. In particular, the server employs the MQ stream only if: i) the viewer's tracked gaze location falls inside the high-saliency regions, and ii) the probability that the viewer's gaze point will soon move outside the high-saliency regions, computed using tracked gaze data and updated saliency values, is below ϵ. Experiments showed that the video streaming rate can be reduced by up to 44%, and subjective quality is noticeably better than a competing scheme at the same rate where the entire video is encoded using equal quantization.
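
    The two-condition switching rule described above can be sketched as follows. The function and parameter names are hypothetical, and `prob_leave_within` stands in for the paper's HMM-based prediction with a much simpler assumption (a fixed per-step probability of staying inside the high-saliency region); it is not the authors' model.

```python
def prob_leave_within(p_stay: float, steps: int) -> float:
    """Assumed stand-in for the HMM prediction: probability that the gaze
    leaves the high-saliency region at least once within `steps` steps,
    given a constant per-step probability `p_stay` of remaining inside."""
    if not 0.0 <= p_stay <= 1.0:
        raise ValueError("p_stay must be a probability")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return 1.0 - p_stay ** steps


def choose_stream(gaze_in_high_saliency: bool, p_leave_soon: float, epsilon: float) -> str:
    """Serve the mixed-quality (MQ) stream only if both conditions hold:
    i) the tracked gaze is inside a high-saliency region, and
    ii) the probability of soon leaving it is below epsilon.
    Otherwise fall back to the high-quality (HQ) stream."""
    if gaze_in_high_saliency and p_leave_soon < epsilon:
        return "MQ"
    return "HQ"


if __name__ == "__main__":
    p_leave = prob_leave_within(p_stay=0.999, steps=10)
    print(choose_stream(True, p_leave, epsilon=0.05))
```

    Falling back to HQ whenever either condition fails keeps the risk of the viewer seeing low-quality regions bounded by ϵ, at the cost of a higher streaming rate during those periods.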

  • QoE-aware resource allocation for scalable video transmission over multiuser MIMO-OFDM systems

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (3)

    In this paper, we investigate how to maximize multiuser Quality of Experience (QoE) when scalable videos are transmitted over Multiple-Input Multiple-Output (MIMO)-Orthogonal Frequency Division Multiplexing (OFDM) systems. We first study the QoE issues in Scalable Video Coding (SVC) adaptation by constructing a QoE assessment database. We derive the optimal scalability adaptation track for individual videos and further summarize common scalability adaptation tracks for grouped videos. A rate model is developed for SVC adaptation and is employed in designing an efficient resource allocation solution for SVC streaming over multiuser MIMO-OFDM systems. Specifically, time-frequency unit assignment, power allocation, and modulation selection are jointly optimized to maximize users' QoE. Experimental results show that the proposed QoE-aware scalability adaptation scheme significantly outperforms conventional adaptation schemes, and that the proposed QoE-aware resource allocation achieves better QoE performance than existing resource allocation methods.

  • Parallel implementations of a disparity estimation algorithm based on a Proximal splitting method

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (1)

    The Parallel Proximal Algorithm (PPXA+) has been recently introduced as an efficient tool for solving convex optimization problems. It has proved particularly effective in the context of stereo vision, used as the methodological core of a novel disparity estimation technique. In this work, the main methodological issues limiting the efficient parallelization of this technique are addressed, and further modifications are proposed to enable and optimize the design of parallel implementations. Finally, actual implementations that fit both the multi-core CPU and GPU devices are provided and tested to validate the performance potential of the proposed technique.

  • 3D depth map generation for embedded stereo applications

    Publication Year: 2012 , Page(s): 1 - 6

    This paper presents a low-complexity 3D image depth map generation algorithm for embedded stereo applications. The proposed algorithm automatically generates depth information from a single-view 2D image. Because scene characteristics differ between images, we propose a mechanism that first classifies images into “Scenery”, “Normal” and “Close-up” types and then generates the associated depth map with the technique proposed for each type. In addition, we propose a human detection method for strengthening the depth information in images with humans, and post-processing for refining the depth map. While producing a depth map of good quality, the proposed algorithm achieves a complexity reduction of about 93% compared to the traditional algorithm, making it suitable for realization in both hardware and embedded systems for portable stereo applications.

  • Stereoscopic video quality assessment model based on spatial-temporal structural information

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (3)

    Most existing 3D video quality assessment methods estimate the quality of each view independently and then pool the results into a single objective score. Moreover, they seldom take the motion information of adjacent frames into consideration. In this paper, we propose an effective stereoscopic video quality assessment method which focuses on the inter-view correlation of spatial-temporal structural information extracted from adjacent frames. The metric jointly represents and evaluates the two views. By selecting salient pixels to be processed and discarding the others, the processing speed is significantly improved. Experimental results on our stereoscopic video database show that the proposed algorithm correlates well with subjective scores.

  • An adaptive hybrid CDN/P2P solution for Content Delivery Networks

    Publication Year: 2012 , Page(s): 1 - 6

    Streaming services have grown rapidly in the last few years, and providers of video on demand, such as Netflix or YouTube, are increasing their number of users even more quickly. The majority of these companies implement their services using huge Content Delivery Networks that are as powerful as they are expensive, e.g. Amazon and Akamai. In this paper we propose a hybrid CDN/P2P solution that aims at reducing the infrastructural costs by exploiting local caching and P2P while guaranteeing an optimal quality of service. The proposed architecture uses a classic CDN complemented by a geographically distributed layer where P2P can be activated by exploiting network awareness, content awareness and locality. The performance of the proposed solution is evaluated by means of a prototype implementation that has been deployed using the PlanetLab network and the Amazon AWS cloud services. Our findings show that the proposed approach provides adaptive, flexible, scalable and content-centric service to the end users while significantly reducing the infrastructural costs.

  • Optimizing JPEG quantization table for low bit rate mobile visual search

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (4)

    Smartphones are opening up new possibilities in mobile visual search. Extensive research efforts have been devoted to compact visual descriptors. However, directly extracting visual descriptors on a mobile device is computationally intensive and time consuming. Towards low bit rate visual search, we propose to deeply compress query images by learning a customized JPEG quantization table in the context of visual search. Distinct from traditional image compression, we incorporate pair-wise image matching precision into the distortion measure and optimize the quantization table to seek a better trade-off between image compression rate and visual search performance. An evolutionary algorithm is employed to learn an optimal quantization table. Under the MPEG CDVS evaluation framework, extensive evaluation has been carried out, including image retrieval and pair-wise matching over 1 million database images. Experimental results demonstrate that our optimized quantization table works much better than the default JPEG table in terms of retrieval/matching performance across a set of different operating points. The proposed low bit rate solution may be easily deployed to smartphones without hardware support, as a useful complement to the ongoing MPEG CDVS standardization efforts.

  • Gradient-based fast decision for intra prediction in HEVC

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (5)

    As the next-generation video coding standard, High Efficiency Video Coding (HEVC) achieves significantly better coding efficiency than all existing video coding standards, but at the cost of much higher computational complexity. To address this issue, this paper presents a gradient-based fast decision algorithm for intra prediction in HEVC. More specifically, intra prediction in HEVC is divided into two stages: prediction unit (PU) size decision and mode decision. In the PU size decision process, four orientation features are extracted from the coding unit by intensity gradient filters to determine the texture complexity and texture direction of the coding unit; the texture direction is then used to exclude impossible prediction modes in the mode decision process. Compared to the HEVC reference software, the proposed algorithm saves around 56.7% of the encoding time in the intra high efficiency setting and up to 70.86% in the intra low complexity setting, with slight performance degradation.

  • A control-theoretic approach to rate adaptation for dynamic HTTP streaming

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (5)

    Recently, dynamic adaptive HTTP streaming has been widely used for video content delivery over the Internet. However, switching the video bitrate under time-varying bandwidth remains a challenge. In this paper, we propose a novel control-theoretic approach to adapt video segments in dynamic HTTP streaming. The rate control is based on a sink buffer, which has an overflow threshold and an underflow threshold. The objective is to maximize the playback quality while keeping the receiver buffer from either overflowing or underflowing. Using control theory, we formulate this rate control scheme as a proportional (P) control system, which exhibits oscillations and steady-state errors. Furthermore, we design a proportional-derivative (PD) controller to improve its adaptation performance. The conditions for stability and the settling time of the PD controller are also derived. Numerous experimental results demonstrate the effectiveness of our proposed PD control scheme for dynamic HTTP streaming.
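
    A buffer-driven PD rate controller of the general kind described above might look like the sketch below. The class name, the gains `kp` and `kd`, the bitrate ladder, and the way the control signal scales the bandwidth estimate are all assumptions made for illustration; the abstract does not give the paper's control law or its stability conditions.

```python
from bisect import bisect_right


class PDRateController:
    """Pick the next segment bitrate from a ladder using a PD term on the
    buffer-level error (hypothetical formulation, not the paper's)."""

    def __init__(self, ladder_kbps, target_buffer_s, kp=0.1, kd=0.05):
        self.ladder = sorted(ladder_kbps)
        self.target = target_buffer_s
        self.kp = kp  # proportional gain (assumed value)
        self.kd = kd  # derivative gain (assumed value)
        self.prev_error = 0.0

    def next_bitrate(self, buffer_s, bandwidth_kbps, dt_s):
        # Positive error: buffer above target, so a higher rate is affordable.
        error = buffer_s - self.target
        derivative = (error - self.prev_error) / dt_s
        self.prev_error = error
        # Scale the bandwidth estimate by the PD control signal.
        factor = max(0.0, 1.0 + self.kp * error + self.kd * derivative)
        budget = bandwidth_kbps * factor
        # Highest ladder entry not exceeding the budget; lowest entry as a floor.
        idx = bisect_right(self.ladder, budget) - 1
        return self.ladder[max(idx, 0)]


if __name__ == "__main__":
    ctrl = PDRateController([500, 1000, 2000, 4000], target_buffer_s=10)
    for buffer_level in (10, 4, 20):
        print(buffer_level, ctrl.next_bitrate(buffer_level, 2500, dt_s=2))
```

    The derivative term reacts to how fast the buffer is draining or filling, which is what lets a PD controller damp the oscillations that a purely proportional rule tends to produce.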

  • Adaptive search range algorithm based on Cauchy distribution

    Publication Year: 2012 , Page(s): 1 - 5

    In video coding standards, motion estimation (ME) plays an important role in reducing temporal redundancies, at the expense of higher computational complexity. Many fast ME algorithms have been proposed to reduce the coding complexity. Some focus on applying specific search patterns to reduce the number of search points within a fixed search range (SR), but only a few try to reduce the size of the SR itself. In this paper, an adaptive SR algorithm is presented. A Cauchy distribution is used to model the SR for one frame, and the motion vector differences of the neighboring blocks are used to adjust the SR for a particular block. Experimental results show that the proposed algorithm can reduce the size of the SR significantly with negligible quality degradation.
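
    One way a Cauchy model can yield a search range is to pick the smallest radius that covers a chosen fraction of the modeled motion vector differences. The sketch below assumes a zero-centered Cauchy distribution with scale `gamma`, for which P(|x| <= r) = (2/π)·arctan(r/γ); the function names, the coverage target and the `max_range` cap are hypothetical, and the per-block adjustment from neighboring blocks is not modeled.

```python
import math


def coverage_of(radius: float, gamma: float) -> float:
    """Probability mass of a zero-centered Cauchy(gamma) inside [-radius, radius]."""
    return (2.0 / math.pi) * math.atan(radius / gamma)


def search_range(gamma: float, coverage: float, max_range: int = 64) -> int:
    """Smallest integer radius whose coverage reaches the target,
    clamped to [1, max_range]. Inverts the coverage formula:
    r = gamma * tan(pi * coverage / 2)."""
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    radius = gamma * math.tan(math.pi * coverage / 2.0)
    return min(max_range, max(1, math.ceil(radius)))


if __name__ == "__main__":
    for target in (0.5, 0.9, 0.99):
        print(target, search_range(gamma=2.0, coverage=target))
```

    Because the Cauchy distribution has heavy tails, the radius grows quickly as the coverage target approaches 1, so the cap on the range matters in practice.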

  • Multiview super-resolution using high-frequency synthesis in case of low-framerate depth information

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (1)

    Increasing the image sharpness of low-resolution views is a key issue in the multiview image and video processing domain. To this end, a low-resolution view is refined with high-frequency content that can be obtained from either temporally or spatially adjacent high-resolution reference images. We propose a refined super-resolution algorithm for multiview images that is robust to the use of temporally highly misaligned depth maps. The temporal misalignment may be caused either by temporal subsampling or by a lower framerate of the depth camera with respect to the image cameras. Our refinement step is based on a blockwise low-frequency registration in order to efficiently adapt the high-frequency content of the high-resolution reference to the low-resolution destination view. The simulation results show that our proposed algorithm leads to a peak PSNR gain of up to 1.21 dB with respect to a comparable unrefined super-resolution approach. On average, our approach outperforms the unrefined super-resolution algorithm by 0.61 dB. For all considered scenarios, the improvement in visual quality is also convincing.

  • WaveCast: Wavelet based wireless video broadcast using lossy transmission

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (5)

    Wireless video broadcasting is a popular application of mobile networks. However, traditional approaches offer limited support for users with diverse channel conditions. The recently introduced SoftCast approach provides smooth multicast performance but is not very efficient in inter-frame compression. In this work, we propose a new video multicast approach: WaveCast. Unlike SoftCast, WaveCast uses motion-compensated temporal filtering (MCTF) to exploit inter-frame redundancy, and uses a conventional framework to transmit motion information so that the motion vectors (MVs) can be reconstructed losslessly. Meanwhile, WaveCast transmits the transform coefficients in lossy mode and degrades gracefully in multicast. In experiments, WaveCast outperforms SoftCast by 2 dB in video PSNR at low channel SNR, and outperforms the H.264-based framework by up to 8 dB in broadcast.

  • Intra coding for depth maps using adaptive boundary location

    Publication Year: 2012 , Page(s): 1 - 6

    Depth maps, an essential part of the new generation of 3D video coding, allow rendering of arbitrary viewpoints of a video scene. Depth maps are characterized by sharp object boundaries, which significantly affect the rendering quality and account for most of the bitrate in depth map coding. This paper proposes a novel intra coding method for depth maps based on a two-step adaptive boundary location process. By extracting a series of sub-blocks along a depth boundary and refining the boundary within the sub-blocks, accurate predictions for blocks with arbitrary edge shapes can be realized. Experimental results show that the proposed scheme achieves bitrate reductions of up to 28%, and 13% on average, for seven test sequences of MPEG 3DV compared to the original intra coding of H.264/AVC at the same quality of synthesized views. In addition, the subjective quality of virtual views is improved owing to well-preserved boundary information.

  • A novel in-loop filter based on CLDT masking effect model for HEVC

    Publication Year: 2012 , Page(s): 1 - 6

    HEVC is the current joint video coding standardization project of the ITU-T Video Coding Experts Group and the Moving Picture Experts Group. It includes a deblocking filter (DF) followed by an adaptive loop filter (ALF). To improve accuracy, this paper modifies the deblocking filter using the parameter information of intra-coded blocks. Considering that the masking effect of the human visual system (HVS) reduces the visibility of blocking artifacts, a DF based on a combined luminance and directional-texture (CLDT) masking effect model is proposed. Finally, a novel Wiener-based in-loop filter is proposed, which reduces the quantization error using both the pre-DF signal and the post-DF signal. Experimental results show improvements in coding efficiency and subjective quality compared with the HEVC test model 2 (HM2.0) anchor.

  • A hierarchical mode decision scheme for fast implementation of spatially scalable video coding

    Publication Year: 2012 , Page(s): 1 - 6

    In order to improve coding efficiency, a new inter-layer prediction mechanism is incorporated in the SVC extension of H.264/AVC. It uses information from the base layer to inform the coding of the enhancement layer. However, this increases the computational requirement. We propose a fast mode decision algorithm that exploits the correlation between a macroblock in the enhancement layer and both the corresponding macroblock in the base layer and the neighbouring macroblocks. The algorithm also assesses the homogeneity of the picture content and uses the mode information of the base layer to make faster mode selections in the enhancement layer. The fact that larger partition sizes are more suitable for homogeneous regions, and vice versa, is also exploited. Empirical evaluation of the proposed algorithm shows that, for similar rate-distortion performance, encoding time is reduced by up to 84% compared with the JSVM9.18 software implementation.

  • Simplified AMVP for High Efficiency Video Coding

    Publication Year: 2012 , Page(s): 1 - 4

    In High Efficiency Video Coding (HEVC), advanced motion vector prediction (AMVP) is adopted to predict the current motion vector using a competition-based scheme over a given candidate set, which includes both spatial and temporal motion vectors. In order to make AMVP more practical, a simplified AMVP is proposed. First, by analyzing the importance of the spatial and temporal candidates, we reduce the number of candidates in the competition set and simplify the redundancy checking process, which decreases the complexity of the decoder and improves its robustness. Second, we simplify the zero motion adding process so that it occurs only when the number of existing candidates is less than the predefined number. Experimental results show that the proposed scheme incurs no loss under random access and low delay conditions. These two simplifications have been proposed and adopted into the HEVC standard.
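
    The general shape of building a small candidate list with a redundancy check and zero-motion padding can be sketched as below. This is an illustrative simplification: the function name, the representation of motion vectors as `(x, y)` tuples, and the list size of 2 are assumptions, and the exact candidate positions and checks of the paper are not given in the abstract.

```python
def build_amvp_candidates(spatial, temporal, list_size=2):
    """Build a motion vector predictor list.

    `spatial` and `temporal` are lists of (x, y) tuples, with None marking
    an unavailable candidate. Spatial candidates are considered first.
    """
    candidates = []
    for mv in list(spatial) + list(temporal):
        # Redundancy check: skip unavailable or duplicate candidates.
        if mv is not None and mv not in candidates:
            candidates.append(mv)
        if len(candidates) == list_size:
            break
    # Zero-motion padding happens only when the list is still short.
    while len(candidates) < list_size:
        candidates.append((0, 0))
    return candidates


if __name__ == "__main__":
    print(build_amvp_candidates([(1, 2), (1, 2)], [(3, 4)]))
    print(build_amvp_candidates([None], [None]))
```

    Keeping the list at a fixed length means the decoder can parse the candidate index without first knowing how many candidates were actually available, which is one reason fixed-size lists help parsing robustness.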

  • Multi-view video streaming over wireless networks with RD-optimized scheduling of network coded packets

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (1)

    Multi-view video streaming is an emerging video paradigm that enables new interactive services, such as free viewpoint television and immersive teleconferencing. However, it comes with a high bandwidth cost, as the equivalent of many single-view streams has to be transmitted. Network coding (NC) can improve the performance of the network by allowing nodes to combine received packets before retransmission. Several works have shown NC to be beneficial in wireless networks, but the delay introduced by buffering before decoding raises a problem in real-time streaming applications. Here, we propose to use Expanding Window NC (EWNC) for multi-view streaming to allow immediate decoding of the received packets. The order in which the packets are included in the coding window is chosen via RD-optimization for the current sending opportunity. Results show that our approach consistently outperforms both classical NC applied on each view independently and transmission without NC.

  • High-quality image interpolation via local autoregressive and nonlocal 3-D sparse regularization

    Publication Year: 2012 , Page(s): 1 - 6
    Cited by:  Papers (2)

    In this paper, we propose a novel image interpolation algorithm that combines a local autoregressive (AR) model and a nonlocal adaptive 3-D sparse model as constraints under a regularization framework. Estimating the high-resolution image with the local AR regularization differs from conventional AR models, which compute weighted interpolation coefficients without considering the rough structural similarity between the low-resolution (LR) and high-resolution (HR) images. The nonlocal adaptive 3-D sparse model is then formulated to regularize the interpolated HR image, which provides a way to correct pixels affected by the numerical stability problem of the AR model. In addition, a new Split-Bregman-based iterative algorithm is developed to solve the resulting optimization problem. Experimental results demonstrate that the proposed algorithm achieves significant performance improvements over traditional algorithms in terms of both objective quality and visual perception.

  • Intra mode coding in HEVC standard

    Publication Year: 2012 , Page(s): 1 - 6

    The new High Efficiency Video Coding (HEVC) standard is designed to provide a substantial coding efficiency improvement over H.264/AVC. The latest subjective testing shows that a 50% improvement has been achieved. Many new technologies contribute to the overall improvement. Intra prediction with 35 modes is one of the key improvements. Associated with it is a new intra mode coding method to efficiently signal the selected modes. This paper presents this new intra mode coding method, which has been adopted by HEVC. In this method, the 35 intra modes are divided into two categories. One category includes the 3 most probable modes (MPMs) and the other includes the 32 remaining modes. Shorter codewords are used for coding the MPMs, and fixed-length coding is used for the remaining modes. Experimental results show that the 3-MPM-based method improves the coding efficiency compared to the prior-art method used in H.264/AVC.
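
    The split into 3 MPMs and 32 remaining modes can be illustrated with the binarization sketch below. The function name and the exact bin strings are assumptions made for this example (an MPM flag, then a short index for an MPM or a 5-bit fixed-length code for a remaining mode); the bins are shown before any entropy coding, and how the three MPMs are derived from neighboring blocks is not covered.

```python
def encode_intra_mode(mode: int, mpms: list) -> str:
    """Return an illustrative bin string for one of 35 intra modes.

    `mpms` is an ordered list of 3 distinct most probable modes.
    """
    if not 0 <= mode < 35:
        raise ValueError("mode must be in 0..34")
    if len(mpms) != 3 or len(set(mpms)) != 3:
        raise ValueError("exactly 3 distinct MPMs are required")
    if mode in mpms:
        # MPM flag = 1, then a short variable-length index (1 or 2 bins).
        return "1" + ("0", "10", "11")[mpms.index(mode)]
    # Remaining modes: remove the 3 MPMs from the numbering so that the
    # 32 remaining modes fit exactly into a 5-bit fixed-length code.
    rem = mode - sum(1 for m in mpms if m < mode)
    return "0" + format(rem, "05b")


if __name__ == "__main__":
    mpms = [0, 1, 26]
    for mode in (0, 26, 10, 34):
        print(mode, encode_intra_mode(mode, mpms))
```

    With this layout an MPM costs 2 or 3 bins while any other mode costs 6, so the scheme pays off whenever the selected mode falls in the MPM set often enough.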
