By Topic

IEEE Transactions on Circuits and Systems for Video Technology

Issue 4 • April 2001

  • Content-based video parsing and indexing based on audio-visual interaction

    Page(s): 522 - 535

    A content-based video parsing and indexing method is presented in this paper, which analyzes both information sources (auditory and visual) and accounts for their interrelations and synergy to extract high-level semantic information. Both frame- and object-based access to the visual information is employed. The aim of the method is to extract semantically meaningful video scenes and assign semantic label(s) to them. Due to the temporal nature of video, time has to be accounted for; thus, time-constrained video representations and indices are generated. The current approach searches for specific types of content information relevant to the presence or absence of speakers or persons. Audio-source parsing and indexing leads to the extraction of a speaker-label mapping of the source over time. Video-source parsing and indexing results in the extraction of a talking-face shot mapping over time. Integration of the audio and visual mappings, constrained by interaction rules, leads to higher levels of video abstraction and even partial detection of the video's context.

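    The abstract does not spell out the interaction rules; the toy sketch below (plain Python) only illustrates how a speaker-label timeline from the audio source and a talking-face shot timeline from the video source could be intersected under one hypothetical rule. The interval format, the rule, and the labels are assumptions for illustration, not the paper's method.

        def overlap(a, b):
            """Length of the overlap between time intervals a = (start, end) and b."""
            return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

        def label_scenes(speaker_segments, face_shots, min_overlap=1.0):
            """speaker_segments: (start, end, speaker_id) tuples from audio parsing.
            face_shots: (start, end) shots in which a talking face was detected.
            Returns (start, end, label) scene annotations."""
            scenes = []
            for (s, e, spk) in speaker_segments:
                talking_face = any(overlap((s, e), shot) >= min_overlap for shot in face_shots)
                # Hypothetical interaction rule: speech with a visible talking face is
                # labelled a monologue, speech without one a voice-over.
                label = ("monologue by %s" % spk) if talking_face else ("voice-over by %s" % spk)
                scenes.append((s, e, label))
            return scenes

        # Example: one speaker segment overlapping one talking-face shot.
        print(label_scenes([(0.0, 12.5, "spk1")], [(1.0, 11.0)]))
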
  • An efficient architecture for two-dimensional discrete wavelet transform

    Page(s): 536 - 545

    This paper proposes an efficient architecture for the two-dimensional discrete wavelet transform (2-D DWT). The proposed architecture includes a transform module, a RAM module, and a multiplexer. In the transform module, the polyphase decomposition technique and the coefficient folding technique are applied to the decimation filters of stages 1 and 2, respectively. In comparison with other 2-D DWT architectures, the advantages of the proposed architecture are 100% hardware utilization, fast computing time (0.5-0.67 times that of parallel filter architectures), regular data flow, and low control complexity, making it suitable for next-generation image compression systems such as JPEG-2000.

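    The paper's contribution is the hardware architecture; as a reference for the computation it implements, here is a minimal sketch of one separable level of the 2-D DWT (stage 1 on rows, stage 2 on columns), using Haar analysis filters as a placeholder since the abstract does not name a filter bank.

        import numpy as np

        # One separable level of the 2-D DWT: filter and decimate rows, then columns.
        # Haar analysis filters are placeholders; the proposed architecture (polyphase
        # decomposition + coefficient folding) is a hardware realization of this computation.
        LO = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass analysis filter
        HI = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass analysis filter

        def analyze_1d(x):
            lo = np.convolve(x, LO)[1::2]   # filter, keep every other sample
            hi = np.convolve(x, HI)[1::2]
            return lo, hi

        def dwt2_level(img):
            # Stage 1: rows
            rows_lo, rows_hi = zip(*(analyze_1d(r) for r in img))
            L, H = np.array(rows_lo), np.array(rows_hi)
            # Stage 2: columns of each row band
            def cols(band):
                lo, hi = zip(*(analyze_1d(c) for c in band.T))
                return np.array(lo).T, np.array(hi).T
            LL, LH = cols(L)
            HL, HH = cols(H)
            return LL, LH, HL, HH

        img = np.arange(64, dtype=float).reshape(8, 8)
        print([b.shape for b in dwt2_level(img)])   # four 4x4 subbands
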
  • A generic postprocessing technique for image compression

    Page(s): 546 - 553

    A postprocessing technique is developed for image quality enhancement. In this method, a distortion-recovery model extracts multiresolution edge features from the decompressed image and uses these visual features as input to estimate the difference image between the original uncompressed image and the decompressed image. Coding distortions are compensated by adding the model output to the decompressed image. Unlike many existing postprocessing methods, which smooth blocking artifacts and are designed specifically for transform coding or vector quantization, the proposed technique is generic and can be applied to all of the main coding methods. Experimental results involving postprocessing of four coding systems show that the proposed technique achieves significant improvements in the quality of reconstructed images, in terms of both the objective distortion measure and subjective visual assessment.

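    A skeleton of the generic postprocessing pipeline described above, assuming a simple difference-of-Gaussian stack for the multiresolution edge features and a placeholder estimator; the actual distortion-recovery model in the paper is learned and is not reproduced here.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def multiresolution_edges(img, sigmas=(1.0, 2.0, 4.0)):
            """Stack of band-pass (difference-of-Gaussian) edge features."""
            feats = []
            prev = img.astype(float)
            for s in sigmas:
                smooth = gaussian_filter(img.astype(float), s)
                feats.append(prev - smooth)
                prev = smooth
            return np.stack(feats, axis=0)

        def postprocess(decoded, estimator):
            """decoded: decompressed image; estimator maps edge features to an
            estimate of (original - decoded). Returns the compensated image."""
            feats = multiresolution_edges(decoded)
            residual = estimator(feats)                # distortion-recovery model output
            return np.clip(decoded + residual, 0, 255)

        # Placeholder estimator: a fixed re-emphasis of the finest edge band.
        toy_estimator = lambda f: 0.2 * f[0]
        decoded = np.random.randint(0, 256, (64, 64)).astype(float)
        enhanced = postprocess(decoded, toy_estimator)
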
  • Buffer management and dimensioning for a pull-based parallel video server

    Page(s): 485 - 496

    There has been a trend toward designing video-on-demand systems using parallel-server architectures. By exploiting server-level parallelism, researchers can break through the performance limit of a single server while keeping the system cost low by leveraging commodity hardware platforms. A number of studies have demonstrated the feasibility of building parallel video servers around the client-pull architecture, and data redundancy can even be incorporated into the system to sustain server-level failures. However, due to the randomness of request arrivals and server processing times, dimensioning the server resource requirement is often difficult. This paper tackles the problem of buffer management and dimensioning for such a pull-based parallel video server. Using a generic buffer-pool model with worst-case analysis, upper bounds on the server buffer requirement are derived for a parallel-server design with multiple disks per server. The obtained bounds are independent of the placement policy, video bit rate, disk-scheduling discipline, and even the number of servers in the system, making them applicable to a wide range of server designs. The analytical results also prove that the scalability of this pull-based server design is not limited by the server buffer requirement.

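    The worst-case bounds require the paper's buffer-pool model, which is not reproduced here; the toy discrete-event sketch below only illustrates the kind of quantity being dimensioned (the peak number of occupied server buffers when requests arrive with jitter and each block occupies a buffer from the start of its disk read to the end of its transmission). All parameters are illustrative assumptions.

        import random

        # Toy simulation of server buffer occupancy in a pull-based video server.
        # Not the paper's analysis: just a demonstration of why random arrivals and
        # service times make the buffer requirement a dimensioning problem.

        def peak_server_buffers(num_requests=5000, period=1.0, jitter=0.5,
                                read_time=0.4, tx_time=0.8, seed=0):
            rng = random.Random(seed)
            events = []
            for i in range(num_requests):
                t_req = i * period + rng.uniform(-jitter, jitter)   # jittered request arrival
                t_alloc = t_req                                     # buffer allocated at read start
                t_free = t_req + read_time + tx_time                # freed when transmission ends
                events.append((t_alloc, +1))
                events.append((t_free, -1))
            occupancy, peak = 0, 0
            for _, delta in sorted(events):
                occupancy += delta
                peak = max(peak, occupancy)
            return peak

        print("peak number of occupied server buffers:", peak_server_buffers())
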
  • On optimal entropy-constrained deadzone quantization

    Page(s): 560 - 563

    This paper studies entropy-constrained deadzone quantizer design. It first shows that, for symmetric unimodal distributions, the entropy of the quantized output is a monotonically decreasing function of the outer bin width of the deadzone quantizer. Based on this result, a fast algorithm is introduced for optimal entropy-constrained deadzone quantizer design. As experimental results, the optimal zero bin width and outer bin width are given for several generalized Gaussian distributions that are often used to model the AC coefficient distributions in image and video transform coding.

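    A minimal sketch of the deadzone quantizer and an empirical check of the monotonicity claim, assuming a Laplacian source (a generalized Gaussian with shape parameter 1) and illustrative bin widths; the paper's fast design algorithm itself is not reproduced.

        import numpy as np

        # Deadzone quantizer: a zero bin of width z around 0 and uniform outer bins of
        # width d. For a symmetric unimodal source, the output entropy decreases as d grows.

        def deadzone_indices(x, z, d):
            ax = np.abs(x)
            k = np.where(ax <= z / 2.0, 0, np.floor((ax - z / 2.0) / d) + 1)
            return (np.sign(x) * k).astype(int)

        def empirical_entropy(indices):
            _, counts = np.unique(indices, return_counts=True)
            p = counts / counts.sum()
            return -np.sum(p * np.log2(p))

        rng = np.random.default_rng(0)
        x = rng.laplace(scale=1.0, size=200000)   # Laplacian = generalized Gaussian, shape 1
        z = 1.0                                    # illustrative zero bin width
        for d in (0.5, 1.0, 2.0, 4.0):
            h = empirical_entropy(deadzone_indices(x, z, d))
            print("outer bin width %.1f -> entropy %.3f bits" % (d, h))
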
  • Rate control for low-bit-rate video via variable-encoding frame rates

    Page(s): 512 - 521

    A novel rate control algorithm with a variable encoding frame rate is proposed for low-bit-rate video coding. Most existing rate control algorithms for low-bit-rate video focus on bit allocation at the macroblock level under a constant frame-rate assumption. The proposed rate control algorithm is able to adjust the encoding frame rate at the expense of a tolerable time delay. The new rate control algorithm attempts to achieve a good balance between spatial quality and temporal quality to enhance the overall human perceptual quality at low bit rates. It is demonstrated that the rate control algorithm achieves higher coding efficiency at low bit rates, with a low additional computational cost. The proposed variable-encoding-frame-rate method is compatible with the bit-stream structure of H.263+.

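    The abstract does not give the control law; the sketch below only illustrates the underlying idea of a variable encoding frame rate, with a hypothetical rule that skips frames whenever a virtual encoder buffer exceeds a threshold. All numbers are illustrative.

        # Toy rate controller with a variable encoding frame rate: when the virtual
        # buffer fills, frames are skipped (lower temporal rate, more bits available
        # per coded frame). The thresholds below are illustrative, not the paper's law.

        def simulate(frame_bits, channel_bits_per_frame=1000, skip_threshold=3000,
                     buffer_size=6000):
            buf, coded, skipped = 0, 0, 0
            for bits in frame_bits:
                if buf > skip_threshold:
                    skipped += 1                    # skip this frame and let the buffer drain
                else:
                    coded += 1
                    buf += bits                     # bits produced by coding this frame
                buf = max(0, buf - channel_bits_per_frame)   # channel drains at a constant rate
                buf = min(buf, buffer_size)
            return coded, skipped

        # A burst of expensive frames (e.g., high motion) forces the frame rate down.
        frames = [800] * 30 + [2500] * 30 + [800] * 30
        print("coded %d frames, skipped %d frames" % simulate(frames))
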
  • An advanced contrast enhancement using partially overlapped sub-block histogram equalization

    Page(s): 475 - 484

    An advanced histogram-equalization algorithm for contrast enhancement is presented. Histogram equalization is the most popular algorithm for contrast enhancement due to its effectiveness and simplicity. It can be classified into two branches according to the transformation function used: global or local. Global histogram equalization is simple and fast, but its contrast-enhancement power is relatively low. Local histogram equalization, on the other hand, can enhance overall contrast more effectively, but the required computation is very costly because of its fully overlapped sub-blocks. In this paper, a low-pass-filter-type mask is applied to the nonoverlapped sub-block histogram-equalization function to produce the high contrast associated with local histogram equalization but with the simplicity of global histogram equalization. The mask also eliminates the blocking effect of nonoverlapped sub-block histogram equalization. The low-pass-filter-type mask is realized by partially overlapped sub-block histogram equalization (POSHE). With the proposed method, since the sub-blocks overlap far less, the computation overhead is reduced by a factor of about 100 compared to that of local histogram equalization, while high contrast is still achieved. The proposed algorithm is suitable for commercial products where high efficiency is required, such as camcorders and closed-circuit cameras.

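    A minimal NumPy sketch of the partially overlapped sub-block idea: sub-blocks are equalized on a grid whose stride is smaller than the block size, and overlapping results are averaged. The block size, stride, and averaging rule are illustrative assumptions rather than the paper's exact mask formulation.

        import numpy as np

        def equalize_block(block, levels=256):
            """Standard histogram equalization of one sub-block."""
            hist = np.bincount(block.ravel(), minlength=levels)
            cdf = np.cumsum(hist).astype(float)
            cdf = (levels - 1) * cdf / cdf[-1]
            return cdf[block]

        def poshe(img, block=64, stride=16, levels=256):
            """Equalize partially overlapped sub-blocks and average overlapping results."""
            h, w = img.shape
            acc = np.zeros((h, w), dtype=float)
            cnt = np.zeros((h, w), dtype=float)
            for y in range(0, h - block + 1, stride):
                for x in range(0, w - block + 1, stride):
                    acc[y:y + block, x:x + block] += equalize_block(img[y:y + block, x:x + block], levels)
                    cnt[y:y + block, x:x + block] += 1
            return (acc / np.maximum(cnt, 1)).astype(np.uint8)

        img = np.random.randint(0, 128, (256, 256)).astype(np.uint8)   # low-contrast test image
        out = poshe(img)
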
  • A fast scheme for image size change in the compressed domain

    Page(s): 461 - 474

    Given a video frame in terms of its 8×8 block-DCT coefficients, we wish to obtain a downsized or upsized version of this frame, also in terms of 8×8 block-DCT coefficients. The DCT, being a linear unitary transform, is distributive over matrix multiplication. This fact has been used for downsampling video frames in the DCT domain. However, this involves matrix multiplication with the DCT of the downsampling matrix, which can be costly enough to offset any gains obtained by operating directly in the compressed domain. We propose an algorithm for downsampling and upsampling in the compressed domain that is computationally much faster, produces visually sharper images, and gives significant improvements in PSNR (typically 4 dB better than bilinear interpolation). Specifically, the downsampling method requires 1.25 multiplications and 1.25 additions per pixel of the original image, compared to 4.00 multiplications and 4.75 additions required by the method of Chang et al. (1995). Moreover, the downsampling and upsampling schemes combined preserve all the low-frequency DCT coefficients of the original image. This implies tremendous savings for coding the difference between the original frame (the unsampled image) and its prediction (the upsampled image), which is desirable for many applications based on scalable encoding of video. The method presented can also be used with transforms other than the DCT, such as the Hadamard or Fourier transforms.

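    A sketch of a low-frequency-retention downsampling step consistent with the description above (keep the 4×4 low-frequency coefficients of each 8×8 DCT block, synthesize 4×4 spatial patches, regroup 2×2 neighborhoods into new 8×8 blocks, and re-transform); scaling and boundary handling in the paper may differ, and the upsampling direction is omitted here.

        import numpy as np
        from scipy.fft import dctn, idctn

        def downsample_blockdct(dct_blocks):
            """dct_blocks: (H/8, W/8, 8, 8) array of 8x8 block-DCT coefficients.
            Returns (H/16, W/16, 8, 8) block-DCT coefficients of the half-size image."""
            by, bx = dct_blocks.shape[:2]
            # 4x4 spatial patches from the low-frequency quarter of every block
            # (the 1/2 factor keeps the DC level consistent after halving the size).
            small = np.array([[idctn(dct_blocks[i, j, :4, :4], norm='ortho') / 2.0
                               for j in range(bx)] for i in range(by)])
            out = np.empty((by // 2, bx // 2, 8, 8))
            for i in range(by // 2):
                for j in range(bx // 2):
                    tile = np.block([[small[2*i, 2*j],     small[2*i, 2*j + 1]],
                                     [small[2*i + 1, 2*j], small[2*i + 1, 2*j + 1]]])
                    out[i, j] = dctn(tile, norm='ortho')
            return out

        # Example: 8x8 block-DCT coefficients of a random 32x32 image.
        img = np.random.rand(32, 32)
        blocks = np.array([[dctn(img[i:i + 8, j:j + 8], norm='ortho')
                            for j in range(0, 32, 8)] for i in range(0, 32, 8)])
        print(downsample_blockdct(blocks).shape)   # (2, 2, 8, 8)
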
  • Optimizing motion-vector accuracy in block-based video coding

    Page(s): 497 - 511

    All motion vectors are typically encoded with the same fixed accuracy, usually 1/2-pixel accuracy, but the best motion-vector accuracies are not known. We present a theoretical framework to find the motion-vector accuracies that minimize the total encoding rate with this type of coder, for the classical case where all motion vectors are encoded with the same accuracy and for new cases where the accuracy is adapted on a frame-by-frame or block-by-block basis. To do this, we analytically model the effect of motion-vector accuracy and show that the energy in a block of the difference frame is approximately quadratic in the accuracy of the block's motion vector. This energy-accuracy model is then used to obtain expressions for the total bit rate (motion rate plus difference-frame rate) in terms of the blocks' motion accuracies and other key parameters. Minimizing these expressions leads to simple formulas that indicate how to choose the best motion-vector accuracies for this type of coder. These formulas also show that the motion accuracy should increase when more texture is present and decrease when there is much scene noise or when the level of compression is high. We implement several entropy and MPEG-like video coders based on our analysis and present experimental results on synthetic and real video sequences. These results suggest that our formulas are accurate and that significant bit-rate savings can be achieved when our optimization procedures are used.

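    A toy 1-D check of the quadratic energy-accuracy relationship stated above: a reference signal is displaced by a random subpixel amount, the motion vector is rounded to a given step (the coarser the step, the lower the accuracy), and the average residual energy of the motion-compensated prediction is measured. The synthetic test signal, linear interpolation, and averaging are illustrative assumptions, not the paper's model.

        import numpy as np

        def shift(signal, d):
            """Shift a 1-D signal by a fractional displacement d (linear interpolation)."""
            n = np.arange(len(signal))
            return np.interp(n - d, n, signal)

        rng = np.random.default_rng(1)
        ref = np.cumsum(rng.standard_normal(4096))        # synthetic reference scan line

        for step in (1.0, 0.5, 0.25, 0.125):               # full-, half-, quarter-, eighth-pel
            energy = 0.0
            for _ in range(200):
                d = rng.uniform(-0.5, 0.5)                  # true subpixel motion
                mv = np.round(d / step) * step              # motion vector at this accuracy
                resid = shift(ref, d) - shift(ref, mv)
                energy += float(np.sum(resid ** 2))
            # Mean residual energy grows roughly with step**2 (quadratic in the accuracy).
            print("step %.3f -> mean residual energy %.2f" % (step, energy / 200))
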
  • Context-based lossless image coding using EZW framework

    Page(s): 554 - 559

    Previous research advances have shown that wavelet-based image-compression techniques offer several advantages over traditional techniques in terms of progressive transmission capability, compression efficiency, and bandwidth utilization. The embedded zerotree wavelet (EZW) coding technique suggested by Shapiro (1993) and its modification, set partitioning in hierarchical trees (SPIHT), suggested by Said and Pearlman (1996), demonstrate the competitive performance of wavelet-based compression schemes. The EZW-based lossless image coding framework consists of three stages: (1) a reversible discrete wavelet transform; (2) hierarchical ordering and selection of wavelet coefficients; and (3) context-modeling-based entropy (arithmetic) coding. The performance of the compression algorithm depends on the choice of various parameters and the implementation strategies employed in all three stages. This paper proposes different context modeling and selection techniques for efficient entropy encoding of wavelet coefficients, along with modifications to the SPIHT algorithm. The results of several experiments presented in this paper demonstrate the importance of context modeling in the EZW framework. Furthermore, this paper shows that appropriate context modeling improves the performance of the compression algorithm after a multilevel subband decomposition is performed.

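    The paper's context definitions and SPIHT modifications are not reproduced here; the sketch below only illustrates the general mechanism of context-conditioned adaptive probability estimation for significance bits, with the ideal code length (-log2 p) standing in for the arithmetic coder and a neighbor-count context chosen purely for illustration.

        import numpy as np
        from collections import defaultdict
        from math import log2

        def context(sig_map, i, j):
            """Context = number of already-coded significant causal neighbors (0..3)."""
            h, w = sig_map.shape
            nbrs = [(i - 1, j), (i, j - 1), (i - 1, j - 1)]
            return sum(sig_map[a, b] for a, b in nbrs if 0 <= a < h and 0 <= b < w)

        def code_significance(coeffs, threshold):
            """Ideal code length (bits) for the significance map at `threshold`,
            with and without context conditioning of the adaptive binary model."""
            h, w = coeffs.shape
            sig = np.zeros((h, w), dtype=int)
            counts = defaultdict(lambda: [1, 1])   # per-context adaptive [zeros, ones]
            flat = [1, 1]                          # single context-free model
            bits_ctx = bits_flat = 0.0
            for i in range(h):
                for j in range(w):
                    bit = int(abs(coeffs[i, j]) >= threshold)
                    c = context(sig, i, j)
                    n0, n1 = counts[c]
                    bits_ctx += -log2((n1 if bit else n0) / (n0 + n1))
                    counts[c][bit] += 1
                    bits_flat += -log2((flat[1] if bit else flat[0]) / sum(flat))
                    flat[bit] += 1
                    sig[i, j] = bit
            return bits_ctx, bits_flat

        # Spatially clustered "wavelet" coefficients, where context conditioning helps.
        coeffs = (np.random.default_rng(0).standard_normal((64, 64))
                  * np.outer(np.linspace(2, 0.1, 64), np.linspace(2, 0.1, 64)))
        print("ideal bits with context / without: %.0f / %.0f" % code_significance(coeffs, 0.5))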

Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems: Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it