IEEE Transactions on Circuits and Systems for Video Technology

Issue 6 • June 2004

  • Table of contents

    Page(s): c1
    PDF (107 KB)
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): c2
    PDF (37 KB)
    Freely Available from IEEE
  • Optimal content-based video decomposition for interactive video navigation

    Page(s): 757 - 775
    PDF (1112 KB) | HTML

    In this paper, an interactive framework for navigating video sequences is presented, built on an optimal content-based video decomposition scheme. In particular, each video sequence is analyzed at different content resolution levels, creating a hierarchy from the lowest (coarse) to the highest (fine) resolution. This content hierarchy is represented as a tree structure, each level of which corresponds to a particular content resolution, while the tree nodes indicate the temporal video segments into which the sequence content is partitioned at a given resolution. A criterion is introduced to measure the efficiency of the proposed scheme in organizing the video visual content and to compare it with other hierarchical video content representations and navigation schemes. Efficiency is measured as the difficulty a user faces in locating a video segment of interest while moving through the levels of the hierarchy, and the video is decomposed so that the best efficiency is achieved. The efficiency of a nonlinear video decomposition scheme depends on: 1) the number of paths required for a user to locate a relevant video segment and 2) the number of shot/frame classes (i.e., content representatives) extracted to represent the visual content. Both issues are addressed in this paper. In the first case, the probability of selecting a relevant video segment on the first path is maximized by extracting optimal content representatives through the minimization of a cross-correlation criterion. A genetic algorithm (GA) is adopted for the minimization, since an exhaustive search for the minimum would be computationally prohibitive. The cross-correlation criterion is evaluated in the feature domain by extracting appropriate global and object-based descriptors for each video frame, so that a better representation of the visual content is achieved. The second aspect (i.e., the number of content representatives) is addressed by minimizing the average transmitted information while taking the temporal video segment complexity into consideration: more content representatives are extracted for video segments of high complexity, whereas fewer are required for low-complexity segments. In addition, a degree of interest is assigned to each video shot (or frame) to reflect how well, in the user's perception, the visual content of a set of shots (frames) satisfies his/her information needs. Finally, a computationally efficient algorithm is proposed to regulate the degree of detail (i.e., the number of shot/frame representatives) in case the visual content is not efficiently represented from the user's perspective. Experimental results on real-life video sequences demonstrate the performance of the proposed GA-based video decomposition scheme compared to other hierarchical video organization methods.

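    The core computational step above is the selection of content representatives by minimizing a cross-correlation criterion with a genetic algorithm (GA). The following is a minimal, self-contained sketch of that kind of GA-driven subset selection; the frame descriptors, the cost function, and all GA parameters are illustrative stand-ins rather than the authors' actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_correlation_cost(features, idx):
    """Mean pairwise correlation among the selected representatives
    (lower = less redundant coverage of the content).  Illustrative criterion."""
    sel = features[idx]
    sel = (sel - sel.mean(1, keepdims=True)) / (sel.std(1, keepdims=True) + 1e-9)
    corr = (sel @ sel.T) / sel.shape[1]
    n = len(idx)
    return (corr.sum() - n) / (n * (n - 1))          # exclude the diagonal

def ga_select(features, k=5, pop=30, gens=50, pmut=0.2):
    """Evolve subsets of k frame indices that minimize the cost above."""
    n = len(features)
    population = [rng.choice(n, size=k, replace=False) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda idx: cross_correlation_cost(features, idx))
        parents = population[:pop // 2]              # truncation selection
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.choice(len(parents), size=2, replace=False)
            child = np.unique(np.concatenate([parents[a], parents[b]]))
            child = rng.choice(child, size=k, replace=False)
            if rng.random() < pmut:                  # point mutation + repair
                child[rng.integers(k)] = rng.integers(n)
                child = np.unique(child)
                while len(child) < k:
                    child = np.unique(np.append(child, rng.integers(n)))
            children.append(child)
        population = parents + children
    return min(population, key=lambda idx: cross_correlation_cost(features, idx))

frame_features = rng.random((200, 64))               # stand-in for per-frame descriptors
print(sorted(ga_select(frame_features, k=5)))
```
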
  • Robust segmentation and tracking of colored objects in video

    Page(s): 776 - 781
    PDF (288 KB) | HTML

    Segmenting and tracking objects in video is of great importance for video-based encoding, surveillance, and retrieval. The inherent difficulty of object segmentation and tracking is to distinguish image changes caused by object displacement from disturbing effects such as noise and illumination changes. Therefore, in this paper, we formulate a color-based deformable model which is robust against noisy data and changing illumination. Computational methods are presented to measure color constant gradients. Further, a model is given to estimate the amount of sensor noise through these color constant gradients. The obtained uncertainty is subsequently used as a weighting term in the deformation process. Experiments are conducted on image sequences recorded from three-dimensional scenes. The experimental results show that the proposed color constant deformable method successfully finds object contours and is robust against changing illumination and against noisy but homogeneous regions.

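    As a rough illustration of the two ingredients named in the abstract, the sketch below computes an illumination-intensity-invariant ("color constant") gradient from normalized-rgb chromaticity and turns an assumed noise level into a weighting term; the paper's own invariants and noise-estimation model differ in detail.

```python
import numpy as np

def color_constant_gradient(rgb, eps=1e-6):
    """Gradient magnitude of chromaticity (normalized rgb), which is
    invariant to a uniform scaling of the illumination intensity."""
    chroma = rgb / (rgb.sum(axis=2, keepdims=True) + eps)
    gy, gx = np.gradient(chroma, axis=(0, 1))
    return np.sqrt((gx ** 2 + gy ** 2).sum(axis=2))

def noise_weight(grad_mag, sigma):
    """Down-weight gradients that are small relative to an assumed
    noise level sigma, usable as a weighting term in a deformable model."""
    return grad_mag ** 2 / (grad_mag ** 2 + sigma ** 2)

img = np.random.rand(120, 160, 3)        # stand-in for a color frame
w = noise_weight(color_constant_gradient(img), sigma=0.05)
print(w.shape, float(w.min()), float(w.max()))
```
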
  • Video object segmentation using Bayes-based temporal tracking and trajectory-based region merging

    Page(s): 782 - 795
    PDF (744 KB) | HTML

    A novel unsupervised video object segmentation algorithm is presented, aiming to segment a video sequence into objects: spatiotemporal regions representing a meaningful part of the sequence. The proposed algorithm consists of three stages: initial segmentation of the first frame using color, motion, and position information, based on a variant of the K-means-with-connectivity-constraint algorithm; a temporal tracking algorithm, using a Bayes classifier and rule-based processing to reassign changed pixels to existing regions and to efficiently handle the introduction of new regions; and a trajectory-based region merging procedure that employs the long-term trajectory of regions, rather than the motion at the frame level, so as to group them into objects with different motion. As shown by experimental evaluation, this scheme can efficiently segment video sequences with fast-moving or newly appearing objects. A comparison with other methods shows segmentation results corresponding more accurately to the real objects appearing in the image sequence.

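    The initial-segmentation stage builds on a K-means variant over color, motion, and position features. The toy sketch below clusters the pixels of a single frame on color plus weighted position only; the connectivity constraint and the motion features of the actual algorithm are omitted.

```python
import numpy as np

def kmeans(feats, k, iters=20, seed=0):
    """Plain K-means on pixel feature vectors; the published algorithm
    additionally enforces a connectivity constraint (omitted here)."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((feats[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(0)
    return labels

h, w = 60, 80
frame = np.random.rand(h, w, 3)                      # stand-in for a color frame
ys, xs = np.mgrid[0:h, 0:w]
pos_weight = 0.5                                     # balances color vs. position
feats = np.concatenate(
    [frame.reshape(-1, 3),
     pos_weight * np.stack([ys / h, xs / w], -1).reshape(-1, 2)], axis=1)
labels = kmeans(feats, k=6).reshape(h, w)
print(np.bincount(labels.ravel()))                   # pixels per initial region
```
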
  • Automatic moving object extraction for content-based applications

    Page(s): 796 - 812
    PDF (728 KB) | HTML

    Rapid developments in the Internet and multimedia applications allow us to access large amounts of image and video data. While significant progress has been made in digital data compression, content-based functionalities are still quite limited. Many existing techniques in content-based retrieval are based on global visual features extracted from the entire image. In order to provide more efficient content-based functionalities for video applications, it is necessary to extract meaningful video objects from scenes to enable object-based representation of video content. Object-based representation is also introduced by MPEG-4 to enable content-based functionality and high coding efficiency. In this paper, we propose a new algorithm that automatically extracts meaningful video objects from video sequences. The algorithm begins with robust motion segmentation on the first two successive frames. To detect moving objects, segmented regions are grouped together according to their spatial similarity. A binary object model for each moving object is automatically derived and tracked in subsequent frames using the generalized Hausdorff distance. The object model is updated in each frame to accommodate complex motions and shape changes of the object. Experimental results using different types of video sequences are presented to demonstrate the efficiency and accuracy of the proposed algorithm.

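    Tracking with the generalized Hausdorff distance amounts to scoring how well a displaced binary object model matches the point set of the next frame. A brute-force sketch is shown below; the point sets, the rank fraction, and the translation-only search are illustrative choices, not the paper's.

```python
import numpy as np

def partial_hausdorff(pts_a, pts_b, frac=0.8):
    """Generalized (rank-based) directed Hausdorff distance: the frac-quantile
    of the distances from points of A to their nearest point of B.  More robust
    to partial occlusion than taking the maximum."""
    d = np.sqrt(((pts_a[:, None, :] - pts_b[None, :, :]) ** 2).sum(-1))
    nearest = d.min(axis=1)
    k = max(int(frac * len(nearest)) - 1, 0)
    return np.sort(nearest)[k]

def best_translation(model_pts, edge_pts, search=4):
    """Test small translations of a binary object model and keep the one with
    the lowest partial Hausdorff distance (toy stand-in for tracking)."""
    best = (np.inf, (0, 0))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            score = partial_hausdorff(model_pts + np.array([dy, dx]), edge_pts)
            best = min(best, (score, (dy, dx)))
    return best

rng = np.random.default_rng(1)
model = rng.integers(0, 50, size=(80, 2)).astype(float)            # model points
scene = np.vstack([model + [3, -2], rng.integers(0, 50, (40, 2))]).astype(float)
print(best_translation(model, scene))                              # expect shift (3, -2)
```
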
  • Real-time object segmentation and coding for selective-quality video communications

    Page(s): 813 - 824
    PDF (512 KB) | HTML

    The MPEG-4 standard enables the representation of video as a collection of objects. This paper describes an automatic system that exploits such a representation. Our system consists of two parts: real-time content extraction algorithms and a real-time multi-object rate control method. We present two approaches to content extraction: foreground segmentation based on two cameras and face segmentation based on a single camera. The main contributions of this paper are: 1) under a stereo camera setup, we improve a disparity estimation algorithm to obtain crisp and smooth boundaries of foreground objects; 2) for a single-camera scenario, we propose a novel algorithm for face detection and tracking, combining facial color and structure information; and 3) we develop a constant-quality variable bitrate (CQ-VBR) control algorithm that guarantees the quality specification for each object obtained from the two content extraction methods. Both segmentation algorithms run in real time on a low-cost media processor and have been tested extensively in various indoor environments. The CQ-VBR control algorithm is a useful tool for the evaluation of object-based coding. For low-bit-rate applications, we can achieve a significant reduction in the overall bitrate, while maintaining the same visual quality of the foreground/face object as compared to conventional frame-based coding. Based on tests conducted on several sequences of different complexity levels, the bit-rate savings can be up to 48%. The satisfactory foreground segmentation (results presented) permits porting a live foreground object into arbitrary scenes to create composite video.

  • An encoder-decoder texture replacement method with application to content-based movie coding

    Page(s): 825 - 840
    PDF (1016 KB) | HTML

    In this paper, we exploit the stylistic characteristics of high-quality entertainment movie sequences, in terms of their textured background, for coding purposes. More specifically, we propose a content-based coding method based on texture replacement. At the encoder, texture is removed from selected regions of the original frames; the resulting frames, with the texture removed, and the parameters of the removed texture are then encoded. At the decoder, the boundaries of the regions without texture are identified, and new texture, synthesized using the decoded texture parameters, is mapped onto these regions. Our experimental results confirm the main advantages of the proposed texture replacement method: a significant bit-rate reduction for the compressed movie sequences with the texture removed, and higher visual quality of the textured background regions reconstructed with synthesized texture than of the same regions when the sequences are simply encoded and decoded. Moreover, our method can be applied as an overlay on top of any standards-compliant coding system.

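    A toy end-to-end illustration of the texture-replacement idea (flatten a region at the encoder, transmit a few parameters, resynthesize at the decoder) is sketched below; the hand-picked mask, the two-parameter texture description, and the i.i.d.-noise "synthesis" are placeholders for the paper's actual texture analysis and synthesis.

```python
import numpy as np

def encode_region(frame, mask):
    """Toy 'texture removal': flatten the masked region to its mean so it codes
    cheaply, and keep simple texture parameters (mean, std) as side information."""
    params = (float(frame[mask].mean()), float(frame[mask].std()))
    flat = frame.copy()
    flat[mask] = params[0]
    return flat, params

def decode_region(flat, mask, params, seed=0):
    """Toy synthesis at the decoder: refill the region with noise matching the
    transmitted parameters (a real synthesizer models spatial structure)."""
    rng = np.random.default_rng(seed)
    out = flat.copy()
    out[mask] = params[0] + params[1] * rng.standard_normal(mask.sum())
    return np.clip(out, 0.0, 1.0)

frame = np.random.rand(72, 88)                # stand-in luminance frame
mask = np.zeros_like(frame, dtype=bool)
mask[20:50, 30:70] = True                     # hypothetical textured background region
flat, params = encode_region(frame, mask)
recon = decode_region(flat, mask, params)
print(params, float(recon[mask].std()))
```
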
  • Adaptive media playout for low-delay video streaming over error-prone channels

    Page(s): 841 - 851
    PDF (504 KB) | HTML

    When media is streamed over best-effort networks, media data is buffered at the client to protect against playout interruptions due to packet losses and random delays. While the likelihood of an interruption decreases as more data is buffered, the latency that is introduced increases. In this paper we show how adaptive media playout (AMP), the variation of the playout speed of media frames depending on channel conditions, allows the client to buffer less data, thus introducing less delay, for a given buffer underflow probability. We proceed by defining models for the streaming media system and the random, lossy, packet delivery channel. Our streaming system model buffers media at the client, and combats packet losses with deadline-constrained automatic repeat request (ARQ). For the channel, we define a two-state Markov model that features state-dependent packet loss probability. Using the models, we develop a Markov chain analysis to examine the tradeoff between buffer underflow probability and latency for AMP-augmented video streaming. The results of the analysis, verified with simulation experiments, indicate that AMP can greatly improve the tradeoff, allowing reduced latencies for a given buffer underflow probability.

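    The effect that the Markov chain analysis quantifies can be illustrated with a small Monte-Carlo sketch: frames arrive over a two-state (good/bad) Markov loss channel, and the client slows playout whenever its buffer runs low. The channel parameters, thresholds, and the omission of retransmission are simplifications of the paper's model.

```python
import numpy as np

def simulate(amp=True, slots=100_000, preroll=8, seed=0,
             p_gb=0.03, p_bg=0.2, loss=(0.01, 0.30),
             low_mark=4, slow_rate=0.75):
    """Toy streaming client: one frame is sent per slot over a two-state Markov
    loss channel; playout consumes one frame per slot, or slow_rate frames per
    slot when AMP is enabled and the buffer is below low_mark.  Returns the
    fraction of playout instants that found the buffer empty (underflow)."""
    rng = np.random.default_rng(seed)
    buf, bad, progress = preroll, False, 0.0
    underflows = plays = 0
    for _ in range(slots):
        # channel state transition, then frame arrival (state-dependent loss)
        bad = (rng.random() < p_gb) if not bad else (rng.random() >= p_bg)
        if rng.random() >= loss[1 if bad else 0]:
            buf += 1
        # playout: slow down when AMP is enabled and the buffer is low
        rate = slow_rate if (amp and buf < low_mark) else 1.0
        progress += rate
        if progress >= 1.0:
            progress -= 1.0
            plays += 1
            if buf > 0:
                buf -= 1
            else:
                underflows += 1
    return underflows / plays

print("underflow rate without AMP:", simulate(amp=False))
print("underflow rate with AMP:   ", simulate(amp=True))
```
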
  • An efficient recursive shortest spanning tree algorithm using linking properties

    Page(s): 852 - 863
    PDF (704 KB) | HTML

    Speed is a major concern for the recursive shortest spanning tree (RSST) algorithm, since its main applications, image segmentation and video coding, process large amounts of data. Several efficient RSST algorithms have been proposed in the literature, but the linking properties are not properly addressed or exploited in these algorithms, and they are intended to produce a truncated RSST. This paper categorizes the linking process into three classes based on link weights: the linking process for link weight equal to zero (LPLW-Z), the linking process for link weight equal to one (LPLW-O), and the linking process for link weight equal to a real number (LPLW-R). We study these linking properties and apply them to an efficient RSST algorithm. The proposed algorithm is novel in that it makes use of the linking properties, and its resulting shortest spanning tree is truly identical to that produced by the conventional algorithm. Our experimental results show that the percentages of links in the three classes are 17%, 27%, and 58%, respectively. This paper also proposes a prediction method for LPLW-O, by which the vertex weight of the next region can be determined by comparing the sizes of the merging regions. It is also demonstrated that the proposed LPLW-O with prediction is applicable to multiple-stage merging. Our experimental results show that the proposed algorithm achieves a substantial improvement over the conventional RSST algorithm.

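    The merge recursion that RSST speeds up is easy to state on its own. Below is a minimal one-dimensional sketch (merge the adjacent regions joined by the shortest link, recompute link weights, repeat); the three link-weight classes and the prediction step proposed in the paper are not modeled.

```python
import numpy as np

def rsst_1d(signal, n_regions):
    """Greedy RSST-style merging on a 1-D signal: every sample starts as its own
    region, and the pair of adjacent regions whose means differ the least (the
    shortest link) is merged recursively until n_regions remain."""
    regions = [[i] for i in range(len(signal))]          # lists of sample indices
    means = [float(v) for v in signal]
    while len(regions) > n_regions:
        # link weight between adjacent regions = |difference of region means|
        weights = [abs(means[i] - means[i + 1]) for i in range(len(regions) - 1)]
        j = int(np.argmin(weights))                      # shortest link
        merged = regions[j] + regions[j + 1]
        means[j] = float(np.mean(signal[merged]))
        regions[j] = merged
        del regions[j + 1], means[j + 1]
    return regions

sig = np.concatenate([np.full(20, 0.1), np.full(15, 0.8), np.full(25, 0.45)])
sig = sig + 0.02 * np.random.default_rng(0).standard_normal(sig.size)
for r in rsst_1d(sig, 3):
    print(f"region: samples {min(r)}..{max(r)}, mean={np.mean(sig[r]):.2f}")
```
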
  • Modified TMN8 rate control for low-delay video communications

    Page(s): 864 - 868
    PDF (216 KB) | HTML

    The existing macroblock-layer rate control schemes in the literature calculate the quantization parameters of all macroblocks (MBs) in a frame in raster-scan order and then encode the MBs in the same order. In practice, the quantization distortion depends heavily on the coding order of the MBs. This work investigates the relationship between quantization distortion and coding order, and we then present a scheme that modifies the encoding order of MBs in TMN8 to favor the more complex MBs. We implement TMN8 and the modified version in an H.263 video codec. The experimental results indicate that our scheme achieves an average PSNR gain of 1.05 dB over TMN8. In addition, it performs better with respect to buffer overflow and underflow, and the average bit rate achieved is closer to the target channel rate. The new rate control scheme is fully compliant with the H.263 coding standard.

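    A minimal sketch of the reordering idea, assuming block variance as the complexity measure, is given below: macroblocks are visited most-complex-first instead of in raster order. The actual TMN8 quantizer computation and the paper's specific ordering criterion are not reproduced.

```python
import numpy as np

def complexity_first_order(frame, mb=16):
    """Order macroblocks by a simple activity measure (variance), most complex
    first, instead of raster order; a rate controller could then assign
    quantizers along this order so the hardest blocks are handled first."""
    h, w = frame.shape[0] // mb, frame.shape[1] // mb
    activity = np.array([[frame[y*mb:(y+1)*mb, x*mb:(x+1)*mb].var()
                          for x in range(w)] for y in range(h)])
    raster = [(y, x) for y in range(h) for x in range(w)]
    return sorted(raster, key=lambda yx: -activity[yx])

frame = np.random.rand(64, 96)              # stand-in luminance frame
order = complexity_first_order(frame)
print(order[:5])                            # the five most active macroblocks, coded first
```
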
  • Linear rate-distortion models for MPEG-4 shape coding

    Page(s): 869 - 873
    PDF (368 KB) | HTML

    In this letter, we explore the rate-distortion (R-D) characteristics of binary shape information in the MPEG-4 standard. The shape coding scheme is a block-based, context-based arithmetic encoding approach, and distortion is introduced by down- and up-sampling. Generally, the shape bit rate and the nonnormalized distortion increase in proportion to the number of border blocks, and at any given resolution scale, the more complex an object, the more distortion is introduced. Consequently, we propose an R-D model based on parameters derived from the border blocks and a block-based shape complexity for the video object. The computational complexity is much lower than that of other existing methods. Experimental results show that the model can accurately predict the bit rate and distortion of the binary shape for rate control purposes.

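    In the spirit of the linear models described above, the sketch below fits shape bits as a linear function of the number of border blocks and a shape-complexity figure by least squares; the numbers are hypothetical measurements used only to show the fitting step, not data from the letter.

```python
import numpy as np

# Hypothetical per-VOP measurements: number of border blocks, a block-based
# shape-complexity figure, and the observed shape bit count (illustrative only).
border_blocks = np.array([12, 18, 25, 31, 40, 52, 61, 70], dtype=float)
complexity    = np.array([0.8, 1.1, 1.0, 1.6, 1.9, 2.2, 2.8, 3.1])
shape_bits    = np.array([310, 455, 580, 820, 1010, 1290, 1580, 1790], dtype=float)

# Fit a linear model  bits ~ a*border + b*complexity + c  by least squares.
A = np.column_stack([border_blocks, complexity, np.ones_like(border_blocks)])
coef, *_ = np.linalg.lstsq(A, shape_bits, rcond=None)
pred = A @ coef
print("coefficients:", np.round(coef, 2))
print("mean abs prediction error (bits):", round(float(np.abs(pred - shape_bits).mean()), 1))
```
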
  • Mode-based error-resilient techniques for the robust communication of MPEG-4 video

    Page(s): 874 - 879
    PDF (296 KB) | HTML

    Transmission of compressed video over error-prone channels requires techniques that compress the video efficiently and are robust against channel errors. MPEG-4 coding can meet the bandwidth limitation, but the highly compressed bits are more susceptible to channel errors. In this paper, we present a mode-based error detection (MED) technique and a mode-based unequal error protection (M-UEP) technique to provide robust transmission of MPEG-4 compressed bits over error-prone channels. The MED technique can effectively and efficiently detect errors that traditional methods cannot. Experimental results show that the error-detection ratio is more than 90%, and sometimes even 100%. Concerning M-UEP, experimental results show that it provides better decoded video quality than the traditional unequal error protection technique. In conclusion, the two proposed techniques are highly effective and efficient for the robust transport of MPEG-4 video over error-prone channels.

  • Dictionary design for matching pursuit and application to motion-compensated video coding

    Page(s): 880 - 886
    PDF (320 KB) | HTML

    We present a new algorithm for matching pursuit (MP) dictionary design. This technique uses existing vector-quantization design techniques and an inner product-based distortion measure to learn functions from a set of training patterns. While this scheme can be applied to many MP applications, we focus on motion-compensated video coding. Given a set of training sequences, data are extracted from the high-energy packets of the motion-compensated frames. Dictionaries with different regions of support are trained, pruned, and finally evaluated on MPEG test sequences. We find that for high bit-rate QCIF sequences we can achieve improvements of up to 0.66 dB with respect to conventional MP with separable Gabor functions.

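    The paper's contribution is the dictionary design; the matching pursuit decomposition that both training and coding rely on can itself be sketched in a few lines, here with a random unit-norm dictionary standing in for a learned one.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=10):
    """Greedy MP: at each step pick the dictionary atom with the largest inner
    product with the residual, record its coefficient, and subtract."""
    residual = signal.astype(float).copy()
    atoms, coeffs = [], []
    for _ in range(n_atoms):
        products = dictionary @ residual           # inner products with all atoms
        k = int(np.argmax(np.abs(products)))
        atoms.append(k)
        coeffs.append(float(products[k]))
        residual = residual - products[k] * dictionary[k]
    return atoms, coeffs, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((256, 64))
D /= np.linalg.norm(D, axis=1, keepdims=True)      # unit-norm atoms
x = 3.0 * D[17] - 2.0 * D[101] + 0.05 * rng.standard_normal(64)
atoms, coeffs, res = matching_pursuit(x, D, n_atoms=5)
print(atoms, np.round(coeffs, 2), float(np.linalg.norm(res)))
```
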
  • An efficient arbitrary downsizing algorithm for video transcoding

    Page(s): 887 - 891
    PDF (280 KB) | HTML

    When delivering video over communication networks, it is often necessary to transcode precoded video content to meet the demands of a broad range of end users with different bandwidth and resource constraints. One solution for transmitting video over bandwidth-constrained channels is to reduce the spatial resolution of the video frames, trading resolution for bit rate. In this paper, an arbitrary downsizing algorithm that operates directly in the discrete cosine transform domain is proposed. Experimental results show that the proposed method achieves satisfactory performance. Compared to existing methods, the algorithm is applicable not only to intra-coded frames but also to inter-coded frames, which is one of its crucial features.

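    A standard way to resize directly in the transform domain, which this kind of transcoder builds on, is to keep only the low-frequency DCT coefficients and inverse-transform at the smaller size. A minimal sketch for a single intra block follows; the paper's handling of inter-coded frames and motion compensation is the harder part and is not shown.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_downsize(block, m):
    """Downsize an NxN block to mxm directly in the DCT domain by keeping the
    low-frequency mxm coefficients (with the orthonormal-DCT scale factor m/N)."""
    n = block.shape[0]
    coeffs = dctn(block, norm="ortho")
    return idctn(coeffs[:m, :m] * (m / n), norm="ortho")

block = np.outer(np.linspace(0, 1, 16), np.linspace(1, 0, 16))    # smooth 16x16 test block
small = dct_downsize(block, 11)                                   # arbitrary (non-dyadic) size
print(small.shape, float(block.mean()), float(small.mean()))      # means should be close
```
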
  • Image scrambling without bandwidth expansion

    Page(s): 892 - 897
    PDF (464 KB)

    Image-scrambling schemes are designed to render the image content unintelligible. Wyner has proposed an elegant one-dimensional (1-D) scrambling scheme without bandwidth expansion, making use of the discrete prolate spheroidal sequences (DPSS). The DPSS are optimal with regard to their energy concentration in a given frequency subband. In this paper, we propose the two-dimensional (2-D) extension and application of this algorithm, discuss the new possibilities introduced by the 2-D approach, and include experimental results.

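    The scheme rests on the energy-concentration property of the DPSS. The short check below (not the scrambling algorithm itself) generates a few sequences with SciPy and measures the fraction of their energy inside the design band.

```python
import numpy as np
from scipy.signal import windows

M, NW = 128, 4.0                    # sequence length and time-halfbandwidth product
W = NW / M                          # normalized halfbandwidth of the subband
tapers = windows.dpss(M, NW, Kmax=4)

# Fraction of each sequence's energy inside the band |f| <= W
nfft = 4096
freqs = np.fft.rfftfreq(nfft)
spec = np.abs(np.fft.rfft(tapers, nfft)) ** 2
in_band = spec[:, freqs <= W].sum(1) / spec.sum(1)
print(np.round(in_band, 6))         # close to 1 for the first few sequences
```
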
  • Global elimination algorithm and architecture design for fast block matching motion estimation

    Page(s): 898 - 907
    PDF (560 KB) | HTML

    This paper presents a new block-matching motion estimation algorithm and its VLSI architecture design. The proposed global elimination algorithm (GEA) is derived from the successive elimination algorithm (SEA), which skips unnecessary sum-of-absolute-differences (SAD) calculations by comparing the minimum SAD with a subsampled SAD (SSAD). Our basic idea is to separate the early-termination decision from the SAD calculation for each candidate block, making the data flow more regular and suitable for hardware. In short, we first compare the rough characteristics (SSAD) of all candidate blocks with the current block; we then select several of the best roughly matched candidate blocks and re-compare them with the current block using detailed characteristics (SAD). Other features of GEA include fixed processing cycles, no initial guess, and high video quality (almost the same as full search). Unlike other fast algorithms, GEA maps to hardware very simply. We propose an architecture composed of a systolic part to efficiently compute the SSAD, an adder tree to support both SSAD and SAD calculations, and a comparator tree to avoid expensive sorting circuits. Simulation results show that our design is much more area-efficient than many full-search architectures while maintaining high video quality and processing capability.

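    The two-stage idea described above (cheap subsampled SADs for every candidate, full SADs only for the few best) can be sketched directly; the block size, search range, and number of retained candidates below are arbitrary, and the systolic/adder-tree hardware mapping is of course not represented.

```python
import numpy as np

def gea_search(cur, ref, top=8, sub=4):
    """Two-stage search in the spirit of GEA: (1) score every candidate block
    with a cheap subsampled SAD (SSAD), (2) recompute the full SAD only for the
    `top` best candidates and return the winning displacement."""
    n = cur.shape[0]
    ssads, cands = [], []
    for dy in range(ref.shape[0] - n + 1):
        for dx in range(ref.shape[1] - n + 1):
            cand = ref[dy:dy + n, dx:dx + n]
            ssads.append(np.abs(cur[::sub, ::sub] - cand[::sub, ::sub]).sum())
            cands.append((dy, dx))
    keep = np.argsort(ssads)[:top]                       # best roughly matched candidates
    best = min(keep, key=lambda i: np.abs(cur - ref[cands[i][0]:cands[i][0] + n,
                                                    cands[i][1]:cands[i][1] + n]).sum())
    return cands[best]

rng = np.random.default_rng(0)
ref = rng.random((48, 48))
cur = ref[10:26, 7:23].copy()                            # true displacement is (10, 7)
print(gea_search(cur, ref))
```
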
  • A fast binary motion estimation algorithm for MPEG-4 shape coding

    Page(s): 908 - 913
    PDF (352 KB)

    This paper presents a fast binary motion estimation (BME) algorithm using a diamond search pattern for MPEG-4 shape coding, a key technology for supporting content-based video coding. Based on the properties of binary shape information, a boundary mask of efficient search positions can be generated, so that a large number of search points can be skipped. Simulation results show that our algorithm, combined with diamond-shaped zones, attains the same bit rate at the same quality while reducing the number of search points in BME to 0.6% of that required by the full-search algorithm described in the MPEG-4 verification model. The proposed algorithm significantly reduces the computational complexity of shape coding and is suitable for real-time software and hardware implementations of MPEG-4 shape coding.

  • Demosaicked image postprocessing using local color ratios

    Page(s): 914 - 920
    PDF (528 KB) | HTML

    A postprocessing method for the correction of visual demosaicking artifacts is introduced. The restored, full-color images previously obtained by cost-effective color filter array interpolators are processed to improve their visual quality. Based on a localized color ratio model and the original underlying Bayer pattern structure, the proposed solution effectively removes false colors while maintaining image sharpness. At the same time, it yields excellent improvements in terms of objective image quality measures.

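    As a generic illustration of correcting a demosaicked image with local color ratios, the sketch below re-estimates the red channel from the median of neighboring R/G ratios; it is not the paper's localized color-ratio model and it ignores the underlying Bayer structure that the paper exploits.

```python
import numpy as np

def ratio_smooth_red(rgb, eps=1e-6):
    """Generic local color-ratio correction: re-estimate R at each pixel as
    G times the median of R/G ratios over a 3x3 neighborhood, which suppresses
    isolated false-color pixels while following the local chrominance."""
    r, g = rgb[..., 0], rgb[..., 1]
    ratio = r / (g + eps)
    h, w = ratio.shape
    padded = np.pad(ratio, 1, mode="edge")
    # stack the 9 shifted copies of the ratio plane and take a per-pixel median
    stack = np.stack([padded[dy:dy + h, dx:dx + w]
                      for dy in range(3) for dx in range(3)])
    out = rgb.copy()
    out[..., 0] = np.clip(g * np.median(stack, axis=0), 0.0, 1.0)
    return out

demosaicked = np.random.rand(32, 32, 3)        # stand-in for a demosaicked image
corrected = ratio_smooth_red(demosaicked)
print(float(np.abs(corrected - demosaicked).mean()))
```
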
  • 2005 IEEE International Symposium on Circuits and Systems (ISCAS 2005)

    Page(s): 921
    PDF (520 KB)
    Freely Available from IEEE
  • Proceedings of the IEEE celebrating 92 years of in-depth coverage on emerging technologies

    Page(s): 922
    PDF (319 KB)
    Freely Available from IEEE
  • Explore IEL IEEE's most comprehensive resource [advertisement]

    Page(s): 923
    PDF (341 KB)
    Freely Available from IEEE
  • Quality without compromise [advertisement]

    Page(s): 924
    PDF (319 KB)
    Freely Available from IEEE
  • IEEE Circuits and Systems Society Information

    Page(s): c3
    PDF (33 KB)
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Page(s): c4
    PDF (30 KB)
    Freely Available from IEEE

Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology 

 


Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it