
Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on

Date: 28 June - 3 July 2009


Displaying Results 1 - 25 of 473
  • [Front cover]

    Publication Year: 2009, Page(s): c1
    PDF (519 KB) | Freely Available from IEEE
  • [Title page]

    Publication Year: 2009, Page(s): c2
    PDF (437 KB) | Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2009, Page(s): ii
    PDF (497 KB) | Freely Available from IEEE
  • Organizing Committee

    Publication Year: 2009, Page(s): iii - xii
    PDF (444 KB) | Freely Available from IEEE
  • Table of contents

    Publication Year: 2009, Page(s): xiii - lvi
    PDF (549 KB) | Freely Available from IEEE
  • Directional filtering transform

    Publication Year: 2009, Page(s): 1 - 4
    Cited by: Papers (1)
    PDF (256 KB) | HTML

    This paper proposes the directional filtering transform (dFT, so named to distinguish it from the conventional DFT) to better exploit intra-frame correlation in H.264 intra-frame coding. The dFT consists of a directional filter followed by an optional DCT. The directional filtering takes one of two forms. The first is uni-directional filtering (UDF), which is similar to H.264 directional intra prediction in that only samples from neighboring blocks are used for prediction. The second is bidirectional filtering (BDF), which exploits correlations among samples from both neighboring blocks and the current block, using a hierarchical multi-layer prediction structure. We present mathematical analyses of UDF and BDF and show the advantage of combining them. The proposed dFT is also integrated into H.264 intra-frame coding, and preliminary experimental results demonstrate its superiority.
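The UDF described above builds on H.264 directional intra prediction, in which a block is predicted from already-reconstructed samples of neighboring blocks and only the residual is transformed. A minimal sketch of the vertical mode, with hypothetical helper names, not the authors' implementation:

```python
# Sketch of uni-directional prediction in the spirit of H.264 vertical
# intra prediction, which the paper's UDF resembles. Hypothetical helper
# names; not the authors' dFT code.

def vertical_intra_predict(top_row, size):
    """Predict a size x size block by copying the reconstructed row above."""
    return [list(top_row[:size]) for _ in range(size)]

def residual(block, pred):
    """Element-wise prediction residual, which would then be transformed (DCT)."""
    return [[b - p for b, p in zip(br, pr)] for br, pr in zip(block, pred)]

block = [[10, 12, 11, 13]] * 4          # flat block with strong vertical correlation
pred = vertical_intra_predict([10, 12, 11, 13], 4)
res = residual(block, pred)             # all-zero residual: prediction is exact
```

When vertical correlation is strong, the residual energy collapses, which is exactly the effect the directional filter is meant to exploit before the optional DCT.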
  • Multiview video coding using projective rectification-based view extrapolation and synthesis bias correction

    Publication Year: 2009, Page(s): 5 - 8
    Cited by: Papers (5)
    PDF (446 KB) | HTML

    Current view synthesis prediction (VSP) techniques for multiview video coding (MVC) rely on disparity-based view interpolation or depth-based 3D warping. The former cannot be applied to every camera view, whereas the latter may require coding of the depth information of a scene. To avoid these constraints, we propose an improved VSP-based MVC scheme based on the following three techniques: 1) view extrapolation, which allows VSP to be applicable to almost all camera views, 2) projective rectification, which improves the synthesis quality when neighboring camera planes are not parallel, and 3) synthesis bias correction, which uses the past synthesis biases to improve the synthesis quality of the current frame. Experimental results demonstrate that our scheme offers PSNR gains of up to 1.6 dB compared to the current MVC standard.
  • Single-iteration full-search fractional motion estimation for quad full HD H.264/AVC encoding

    Publication Year: 2009, Page(s): 9 - 12
    Cited by: Papers (6)
    PDF (508 KB) | HTML

    Fractional motion estimation (FME) is widely used in video compression standards. In H.264/AVC, motion vector precision extends down to quarter-pixel to improve coding efficiency. However, FME occupies over 45% of the computational complexity of an H.264 encoder, and this high complexity limits processing capability. In this paper, a single-iteration full-search FME is proposed. Through algorithm and architecture co-optimization, the bandwidth to the frame buffer is reduced by 31%. Furthermore, 82% of the circuit area for the Hadamard transformation and subtraction is saved relative to a direct implementation. Compared with prior art, the proposed design supports 3.39 times higher throughput with only 0.02 dB PSNR drop. Thus, 4096 × 2160 quad full HD H.264/AVC FME processing can be achieved.
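The Hadamard transformation mentioned above underlies the SATD cost that H.264 encoders typically use to compare fractional-pel candidates. A textbook 4×4 sketch of that metric, not the paper's optimized hardware datapath:

```python
# Minimal 4x4 Hadamard-transform SATD: transform the residual with the
# Hadamard matrix on both sides, then sum absolute coefficients.

H4 = [[1,  1,  1,  1],
      [1,  1, -1, -1],
      [1, -1, -1,  1],
      [1, -1,  1, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def satd4x4(cur, ref):
    """SATD cost between a current and a reference 4x4 block."""
    diff = [[cur[i][j] - ref[i][j] for j in range(4)] for i in range(4)]
    t = matmul(matmul(H4, diff), H4)   # H * D * H^T (H4 is symmetric)
    return sum(abs(v) for row in t for v in row)

a = [[1, 2, 3, 4]] * 4
zero_cost = satd4x4(a, a)              # identical blocks -> cost 0
nonzero_cost = satd4x4(a, [[0] * 4] * 4)
```

Hardware designs like the one in this paper share and reorder exactly these subtraction and butterfly stages to cut circuit area.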
  • Rate-distortion analysis of rectification-based view interpolation for multiview video coding

    Publication Year: 2009, Page(s): 13 - 16
    Cited by: Papers (2)
    PDF (276 KB) | HTML

    View interpolation has been applied in multiview video coding (MVC). However, existing schemes assume that all cameras are aligned, and they may not perform well when neighboring cameras point in different directions. In this paper, we apply rectification-based view interpolation to MVC. We first derive the theoretical performance gain of rectification-based view interpolation over the existing interpolation method. We then analyze the rate-distortion performance of the proposed MVC method and compare it with disparity-compensation-based MVC and view-interpolation-based MVC without rectification. The analyses show that view rectification can offer additional coding gain over existing view-interpolation-based MVC, especially when the disparity estimation is accurate and the neighboring cameras are close to each other. Preliminary experimental results are provided to verify the theoretical analysis.
  • Fast mode selection scheme for H.264/AVC inter prediction based on statistical learning method

    Publication Year: 2009, Page(s): 17 - 20
    Cited by: Papers (2)
    PDF (242 KB) | HTML

    H.264 adopts variable-block-size motion estimation and rate-distortion-optimized mode decision to improve video quality and compression ratio. These techniques have made H.264 better than other existing video coding standards, but they are computationally intensive and time-consuming. In this paper, a fast mode selection scheme is proposed for H.264 inter prediction. First, the first few frames are encoded and thresholds are acquired through a statistical learning process. Then, for the remaining frames, motion estimation and mode decision are performed only for the candidate modes selected by the proposed fast mode selection scheme. The proposed approach is applicable to all existing motion search algorithms. Moreover, thresholds are computed on-line, separately for each sequence. Results show that total encoding time is reduced by 57.2% on average with negligible video quality degradation.
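The core idea above, learn thresholds from the first few frames, then prune modes for the rest, can be sketched in a few lines. The mode names, cost values, and safety margin below are illustrative assumptions, not values from the paper:

```python
# Threshold-based mode pruning sketch: thresholds come from a short
# statistical learning phase; later frames evaluate only modes whose
# estimated RD cost falls under the learned threshold.

def learn_threshold(winning_costs, margin=1.2):
    """Threshold = mean winning RD cost of training frames, plus a margin."""
    return margin * sum(winning_costs) / len(winning_costs)

def candidate_modes(mode_costs, threshold):
    """Keep modes cheap enough to plausibly win; always keep the cheapest."""
    best = min(mode_costs, key=mode_costs.get)
    return {m for m, c in mode_costs.items() if c <= threshold} | {best}

thr = learn_threshold([100, 120, 110])                       # from training frames
modes = candidate_modes({'16x16': 95, '16x8': 140, '8x8': 200}, thr)
```

Full motion estimation then runs only on `modes`, which is where the reported 57.2% average time saving would come from.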
  • A new strategy to predict the search range in H.264/AVC

    Publication Year: 2009, Page(s): 21 - 24
    PDF (171 KB) | HTML

    Motion estimation is a very time-consuming part of the H.264 codec, and many strategies have been used to reduce motion estimation time. Dynamic search range (DSR) is one of them. In this paper, based on an analysis of the problems in the existing algorithm, we propose a new strategy that predicts the motion estimation search range using image size, block mode, and QP. Experimental results show that the proposed algorithm reduces motion estimation time by up to 16.94% and saves 9.15% of total encoding time, with only 0.001 dB PSNR loss and a 0.51% bit-rate increase compared with the existing DSR algorithm in the UMHexagonS search. In addition, the algorithm is easy to combine with other motion estimation algorithms in the JM reference software.
  • Content-based hierarchical motion description for multiple video adaptation

    Publication Year: 2009, Page(s): 25 - 28
    Cited by: Papers (1)
    PDF (310 KB) | HTML

    Video adaptation is considered a promising technique for tackling challenging problems in pervasive multimedia applications. However, the styles of video representation and description in existing frameworks are not flexible enough to adapt to diversified application environments. In this paper, we propose a novel solution based on an intermediate description, which supports fast multiple video adaptation operations at the signal level (e.g., temporal resolution reduction, bit-rate adaptation), at the structural level (e.g., random access to any shot, fast preview of key-frames or thumbnails), and jointly at both levels. Experimental results show that the presented solution supports these operations in a real-time environment while maintaining coding efficiency, demonstrating its feasibility and effectiveness.
  • Spatial transcoding from Scalable Video Coding to H.264/AVC

    Publication Year: 2009, Page(s): 29 - 32
    PDF (238 KB) | HTML

    Scalable Video Coding (SVC) is backward compatible with H.264/AVC in the sense that the base-layer sub-bitstream is decodable by an H.264/AVC decoder. However, in some applications it is desirable for an H.264/AVC decoder to obtain a higher-resolution video representation than the SVC base layer. To fulfill such application scenarios, transcoding of SVC enhancement layers to H.264/AVC is required. This paper presents a transcoding scheme capable of transcoding a spatially scalable SVC bitstream to an H.264/AVC bitstream with higher resolution than the H.264/AVC-compliant base layer. To reduce complexity at the transcoder, a fast mode decision (MD) process is proposed in which the original SVC macroblock coding modes and motion information are reused as much as possible. Experimental results show that the proposed scheme performs well compared with full-decoding-and-re-encoding transcoding, at much lower computational complexity.
  • Block-Matching Translation and Zoom Motion-Compensated Prediction

    Publication Year: 2009, Page(s): 33 - 36
    PDF (223 KB) | HTML

    In modern video coding standards, motion-compensated prediction (MCP) plays a key role in achieving compression efficiency. Most standards use block matching techniques and assume purely translational motion, while attempts at more general motion models are usually too complex to be practical in the near future. In this paper, a new Block-Matching Translation and Zoom Motion-Compensated Prediction (BTZMP) is proposed to extend the pure translational model to a more general model with zooming. It models camera zoom and object motions that appear as zooming when projected onto video frames. Experimental results show that BTZMP gives prediction gains of up to 2.25 dB on various sequences compared to conventional block-matching MCP. BTZMP can also be combined with the multiple-reference-frames technique for further improvement, as evidenced by prediction gains ranging from 2.03 to 3.68 dB in the empirical simulations.
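A translation-plus-zoom search space can be illustrated with a toy matcher: each candidate is a shift plus a zoom factor, and the zoomed reference block is resampled by nearest neighbor. This is a conceptual sketch of the enlarged BTZMP candidate set, not the authors' code:

```python
# Toy translation-plus-zoom block matching: try several zoom factors at a
# fixed center and keep the one with the lowest SAD.

def sample_zoomed(ref, cx, cy, size, zoom):
    """Read a size x size block centered at (cx, cy), scaled by `zoom`."""
    half = size // 2
    out = []
    for i in range(size):
        row = []
        for j in range(size):
            y = min(max(int(round(cy + (i - half) * zoom)), 0), len(ref) - 1)
            x = min(max(int(round(cx + (j - half) * zoom)), 0), len(ref[0]) - 1)
            row.append(ref[y][x])
        out.append(row)
    return out

def sad(a, b):
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

ref = [[x + y for x in range(16)] for y in range(16)]
cur = sample_zoomed(ref, 8, 8, 4, 2.0)      # current block is a "zoomed" view
best = min(((z, sad(cur, sample_zoomed(ref, 8, 8, 4, z)))
            for z in (1.0, 1.5, 2.0)), key=lambda t: t[1])
```

A purely translational matcher cannot drive this SAD to zero; adding the zoom dimension does, which mirrors the prediction-gain argument in the abstract.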
  • Content aware configurable architecture for H.264/AVC integer motion estimation engine

    Publication Year: 2009, Page(s): 37 - 40
    PDF (154 KB) | HTML

    In this paper, we contribute a configurable SAD-tree architecture based on an adaptive subsampling scheme. First, by further exploiting spatial features, the integer motion estimation process is greatly sped up. Second, the conventional partial sum of absolute differences (SAD) based pipeline structure is reorganized into a configurable SAD-oriented form, which enhances performance and solves the data reuse problem that the adaptive scheme causes at the architecture level. Moreover, a circuit-level optimization based on cross reuse and compressor trees reduces hardware cost by 6.56%. Experiments show that our design achieves an average 42.23% saving in processing cycles compared with previous designs. With 323 k gates at about 144.8 MHz, our design achieves real-time encoding of 1088p HDTV at 30 fps.
  • Accurate bit prediction for intra-only rate control

    Publication Year: 2009, Page(s): 41 - 44
    Cited by: Papers (2)
    PDF (394 KB) | HTML

    Rate control plays a crucial role in video communication applications, ensuring that the generated compressed bit streams satisfy bandwidth and buffer constraints. The rate control algorithms recommended for H.264/AVC adopt rate-distortion (R-D) models to determine quantization parameters (QPs) for inter-frames but not for intra-frames; instead, they compute intra-frame QPs directly, without considering bit rates or coding complexity. To obtain more accurate target bit prediction for intra-frames, we first introduce geometry gradient information as a new complexity measure that accurately represents intra-frame complexity. We then propose a novel R-D model that integrates a linear rate-complexity model with an exponential rate-quantization model. Finally, we develop an accurate and robust intra-only rate control algorithm for H.264/AVC. Experimental results demonstrate that, compared with JVT-W042, the proposed algorithm achieves more precise bit estimation, provides more robust buffer control, and improves coding quality.
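The model shape described above, bits linear in a complexity measure and exponentially decaying in QP, can be inverted to pick a QP for a bit budget. The coefficients below are illustrative assumptions, not the paper's fitted values:

```python
# Sketch of a combined linear rate-complexity / exponential
# rate-quantization model, and its inversion for rate control.
import math

def predict_bits(complexity, qp, a=50.0, b=0.1):
    """Bits grow linearly with complexity, fall exponentially with QP."""
    return a * complexity * math.exp(-b * qp)

def qp_for_budget(complexity, target_bits, a=50.0, b=0.1):
    """Invert the model: QP whose predicted bits meet the budget, clipped."""
    qp = math.log(a * complexity / target_bits) / b
    return min(max(round(qp), 0), 51)        # clip to the H.264 QP range 0..51

qp = qp_for_budget(complexity=4.0, target_bits=50.0)
```

Fitting `a` and `b` per sequence (and refreshing them as frames are coded) is what turns this toy into a usable rate controller.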
  • Frame complexity prediction for H.264/AVC rate control

    Publication Year: 2009, Page(s): 45 - 48
    Cited by: Papers (2)
    PDF (462 KB) | HTML

    Rate control regulates the output bit rate of a video encoder to obtain optimal visual quality within the available network bandwidth and to maintain buffer fullness within a specified tolerance range. In this paper, we propose a novel rate control scheme for H.264/AVC video compression with a number of new features. We first introduce an approach to calculating frame complexity based on linear prediction theory. We then propose a joint rate-distortion model that integrates a linear rate-complexity model with an exponential rate-quantization model. Finally, we develop an effective target bit estimation approach. Experimental results show that, compared with JVT-W042, our scheme achieves more accurate rate regulation, provides robust buffer control, efficiently reduces frame skipping, and improves visual quality.
  • Fractional compensation for spatial scalable video coding

    Publication Year: 2009, Page(s): 49 - 52
    PDF (415 KB) | HTML

    This paper proposes a novel fractional compensation (FC) approach for spatial scalable video coding. It simultaneously exploits inter-layer and intra-layer correlation through learning-based mapping. Instead of using an enhancement-layer reconstruction as an entire reference, a set of reference pairs is generated from the high-frequency components of both the base-layer and enhancement-layer reconstructions of the previous frame. The reference set, which consists of low-resolution and high-resolution patches, can be generated in both the encoder and the decoder by on-line learning. During encoding of the enhancement layer, a prediction is first obtained from the base layer, from which low-resolution patches are extracted. These patches are then used as indices to find the matching high-resolution patches in the reference set. Finally, the prediction enhanced by the high-resolution patches is used for coding. The proposed approach requires no motion bits. With the proposed FC approach, the performance of H.264 SVC spatial scalable coding can be improved by up to 2.4 dB.
  • Estimating spatial cues for audio coding in MDCT domain

    Publication Year: 2009, Page(s): 53 - 56
    Cited by: Papers (2)
    PDF (220 KB) | HTML

    Although widely used elsewhere, the MDCT is excluded from current spatial-cue representation schemes because it lacks phase information and does not conserve energy. Combining the MDCT with the MDST overcomes these difficulties, and MDST spectra can be reconstructed exactly from neighboring MDCT spectra. In matrix form, the MDCT-to-MDST conversion is well approximated by a banded sparse matrix. When applied to spatial audio coding with MDCT-based core coders, this method avoids a separate transform for cue representation and saves significant computation. Listening tests also show that it delivers the same audio quality as other complex-transform-based methods.
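The point above is that the pair (MDCT, MDST) behaves like a complex spectrum whose squared magnitude carries energy, which the MDCT alone does not. The direct-definition O(N²) transforms below merely illustrate that pairing on a basis-aligned input; the paper instead derives the MDST from neighboring MDCT frames:

```python
# Direct-definition MDCT/MDST pair; their squared sum gives a usable
# power spectrum. Illustrative only, not the paper's banded conversion.
import math

def mdct(x):
    N = len(x) // 2
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N)) for k in range(N)]

def mdst(x):
    N = len(x) // 2
    return [sum(x[n] * math.sin(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N)) for k in range(N)]

N = 8
# Input equal to the k = 2 MDCT basis function, so energy lands in bin 2.
x = [math.cos(math.pi / N * (n + 0.5 + N / 2) * 2.5) for n in range(2 * N)]
power = [c * c + s * s for c, s in zip(mdct(x), mdst(x))]
peak = max(range(N), key=power.__getitem__)   # energy concentrated at bin 2
```

With only the MDCT coefficients, an input's phase shifts the per-bin values; the combined power spectrum is what makes inter-channel level cues estimable.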
  • Video coding based on audio-visual attention

    Publication Year: 2009, Page(s): 57 - 60
    Cited by: Papers (5)
    PDF (201 KB) | HTML

    This paper proposes an efficient video coding method based on audio-visual attention, which is motivated by the fact that cross-modal interaction significantly affects humans' perception of multimedia content. First, we propose an audio-visual source localization method to locate the sound source in a video sequence. Then, its result is used for applying spatial blurring to video frames in order to reduce redundant high-frequency information and achieve coding efficiency. We demonstrate the effectiveness of the proposed method for H.264/AVC coding along with the results of a subjective evaluation.
  • Fast multi-reference motion estimation via statistical learning for H.264/AVC

    Publication Year: 2009, Page(s): 61 - 64
    Cited by: Papers (1)
    PDF (205 KB) | HTML

    In the H.264/AVC coding standard, motion estimation (ME) may use multiple reference frames to further reduce temporal redundancy in a video sequence. Although this reduces motion compensation errors, it also introduces tremendous computational complexity. In this paper, we propose a statistical learning approach to reduce the computation involved in multi-reference motion estimation. Representative features are extracted in advance to build a learning model, and an off-line-trained pre-classification approach then determines the best number of reference frames from run-time features, so that motion estimation is performed only on the necessary reference frames. Experimental results show that the proposed method is about three times faster than the conventional fast ME algorithm while video quality degradation is negligible.
  • Block-based color correction algorithm for multi-view video coding

    Publication Year: 2009, Page(s): 65 - 68
    PDF (441 KB) | HTML

    Color variations among different viewpoints in multiview video sequences may deteriorate visual quality and coding efficiency. Various color correction methods have been proposed; however, the color appearance and histograms of the corrected target frames do not match the reference frames closely enough in detail. Focusing on restoring more similar color, a block-based color correction algorithm is proposed. Blocks in the reference frames are matched to the target frames through spatial prediction, and a colorization scheme then propagates color as a coarse correction. Finally, blending with the global color transfer result yields the fine correction. Experimental results show that this method provides better visual quality in detail and produces corrected frames whose histograms more closely match the reference histograms.
  • A Multi-layer motion estimation scheme for spatial scalability in H.264/AVC scalable extension

    Publication Year: 2009, Page(s): 69 - 72
    Cited by: Papers (1)
    PDF (398 KB) | HTML

    In this paper, we propose a fast multi-layer motion estimation algorithm for spatial scalability provided in H.264/AVC scalable extension, based on the reuse of the motion vectors from multiple spatial layers. The reused motion vector is used to set a search center and refined within a small search area. However, the reused motion vector often produces significant prediction error at object boundaries. Motion vector difference defined in the H.264/AVC standard is used to decide whether the reused motion vector is appropriate. In addition, a search range is dynamically adjusted based on the distribution of the rate-distortion cost. By using the proposed multi-layer motion estimation, we reduce the execution time of motion estimation by almost 93% at the cost of 0.01 dB PSNR decrease and 0.79% bit-rate increase.
  • Kurtosis-based super-resolution algorithm

    Publication Year: 2009, Page(s): 73 - 76
    PDF (1095 KB) | HTML

    A kurtosis-based super-resolution image reconstruction algorithm is proposed in this paper. First, we define the kurtosis image and analyze two of its properties: (i) the kurtosis image is invariant to Gaussian noise, and (ii) the absolute value of a kurtosis image becomes smaller as the image gets smoother. We then build a constrained absolute local kurtosis maximization function to estimate the high-resolution image by fusing multiple blurred low-resolution images corrupted by intense white Gaussian noise, using a Lagrange multiplier to solve the constrained optimization problem. Experimental results on both synthetic and real-world examples under severe noise demonstrate that the proposed method outperforms conventional algorithms in visual quality and robustness, with an improvement of 0.5 to 2.0 dB in PSNR over other approaches.
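The "kurtosis image" described above maps each pixel to the excess kurtosis of its local neighborhood. A minimal window-based sketch of that quantity, not the authors' constrained optimizer:

```python
# Local excess kurtosis and a windowed "kurtosis image" sketch.
# Excess kurtosis = m4 / var^2 - 3 (zero for a Gaussian, which is why the
# measure is insensitive to added Gaussian noise).

def excess_kurtosis(vals):
    n = len(vals)
    mu = sum(vals) / n
    var = sum((v - mu) ** 2 for v in vals) / n
    m4 = sum((v - mu) ** 4 for v in vals) / n
    return m4 / (var * var) - 3.0 if var > 0 else 0.0

def kurtosis_image(img, radius=1):
    """Map each pixel to the excess kurtosis of its (2r+1)^2 neighborhood."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            win = [img[j][i]
                   for j in range(max(0, y - radius), min(h, y + radius + 1))
                   for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = excess_kurtosis(win)
    return out

k = excess_kurtosis([1.0, 1.0, -1.0, -1.0])     # two-point distribution: -2
ki = kurtosis_image([[0.0, 1.0], [1.0, 0.0]])
```

Smoothing (blurring) pulls local distributions toward Gaussian, shrinking |kurtosis|, which is the property the maximization objective exploits.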
  • A robust spatial-temporal line-warping based deinterlacing method

    Publication Year: 2009, Page(s): 77 - 80
    PDF (129 KB) | HTML

    In this paper, a line-warping-based deinterlacing method is introduced. The missing pixels in interlaced video are derived by warping pixels in horizontal line pairs. To increase the accuracy of temporal prediction, multiple temporal line pairs, selected according to a constant-velocity model, are used for warping. Stationary pixels are well preserved by accurate stationarity detection, and a soft switch between spatial-temporal interpolated values and the temporal average prevents unstable switching. Owing to these novelties, the proposed method yields deinterlaced video of higher visual quality than conventional methods. Moreover, it suppresses most deinterlacing artifacts, such as line crawling, flickering, and ghost shadows.
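The soft switch described above can be illustrated with the simplest spatio-temporal deinterlacer: blend a spatial estimate (lines above/below) with a temporal estimate (previous/next fields) according to detected motion. The weights below are illustrative, not the paper's line-warping model:

```python
# Toy per-pixel spatio-temporal deinterlacing with a soft switch:
# static areas lean on the temporal average, moving areas on the
# spatial average, with a continuous blend in between.

def interpolate_pixel(above, below, prev, nxt):
    spatial = (above + below) / 2.0
    temporal = (prev + nxt) / 2.0
    motion = abs(prev - nxt)                   # crude stationarity measure
    w = min(motion / 16.0, 1.0)                # 0 = static, 1 = fast motion
    return w * spatial + (1.0 - w) * temporal  # soft switch, no hard toggle

static_px = interpolate_pixel(100, 110, 90, 90)   # no motion -> temporal average
moving_px = interpolate_pixel(100, 110, 0, 200)   # fast motion -> spatial average
```

The continuous weight is what avoids the visible flicker that hard spatial/temporal switching produces at motion boundaries.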