Circuits and Systems for Video Technology, IEEE Transactions on

Issue 4 • Date April 2005

  • Table of contents

    Page(s): c1
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): c2
    Freely Available from IEEE
  • Coordinated application of multiple description scalar quantization and error concealment for error-resilient MPEG video streaming

    Page(s): 457 - 468

    Historically, multiple description coding (MDC) and postprocessing error concealment (ECN) algorithms have evolved separately. In this paper, we propose a coordinated application of multiple description scalar quantizers (MDSQ) and ECN, where the smoothness of the video signal helps to compensate for the loss of descriptions. In particular, we perform a reconstruction that is consistent with the data received at the decoder. When only a single description is available, the video is reconstructed in such a way that: 1) if we were to regenerate two descriptions (from the reconstructed video), one of them would be equivalent to the received description and 2) the reconstructed video is spatiotemporally smooth. Experimental results with several video sequences demonstrated a peak signal-to-noise ratio (PSNR) improvement of 0.9-2.8 dB for intracoded frames. The PSNR improvements for intercoded frames were negligible. However, in both cases, the visual improvements were much more striking than the PSNR improvement suggested.
  • Multiscale LMMSE-based image denoising with optimal wavelet selection

    Page(s): 469 - 481

    In this paper, a wavelet-based multiscale linear minimum mean square-error estimation (LMMSE) scheme for image denoising is proposed, and the determination of the optimal wavelet basis with respect to the proposed scheme is also discussed. The overcomplete wavelet expansion (OWE), which is more effective than the orthogonal wavelet transform (OWT) in noise reduction, is used. To exploit the strong interscale dependencies of OWE, we combine the pixels at the same spatial location across scales as a vector and apply LMMSE to the vector. Compared with the LMMSE within each scale, the interscale model exploits the dependency information distributed at adjacent scales. The performance of the proposed scheme depends on the selection of the wavelet bases. Two criteria, the signal information extraction criterion and the distribution error criterion, are proposed to measure the denoising performance. The optimal wavelet that achieves the best tradeoff between the two criteria can be determined from a library of wavelet bases. To estimate the wavelet coefficient statistics precisely and adaptively, we classify the wavelet coefficients into different clusters by context modeling, which exploits the wavelet intrascale dependency and yields a local discrimination of images. Experiments show that the proposed scheme outperforms some existing denoising methods.
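The LMMSE step named in the abstract above can be illustrated with the textbook estimator for a zero-mean vector observed in uncorrelated additive noise. This is only a minimal sketch under that assumption: the function name `lmmse_estimate` and the example covariances are hypothetical, and the paper's wavelet expansion, context modeling, and wavelet selection are not reproduced here.

```python
import numpy as np

def lmmse_estimate(y, signal_cov, noise_cov):
    """Textbook LMMSE estimate of zero-mean x from y = x + n.

    Assumes x and n are uncorrelated, with known covariance matrices.
    Computes x_hat = Rx (Rx + Rn)^{-1} y.
    """
    y = np.asarray(y, dtype=float)
    signal_cov = np.asarray(signal_cov, dtype=float)
    noise_cov = np.asarray(noise_cov, dtype=float)
    # Solve (Rx + Rn) z = y rather than forming an explicit inverse.
    z = np.linalg.solve(signal_cov + noise_cov, y)
    return signal_cov @ z

# Hypothetical example: two coefficients at the same location in two scales.
y = np.array([4.0, 2.0])                 # noisy coefficients
Rx = np.array([[3.0, 0.0], [0.0, 1.0]])  # assumed signal covariance
Rn = np.eye(2)                           # assumed unit-variance noise
x_hat = lmmse_estimate(y, Rx, Rn)
```

With diagonal covariances, as in this example, the estimator reduces to an independent shrinkage of each coefficient; off-diagonal entries in `Rx` are what would let one scale inform the estimate at another.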
  • Sender-adaptive and receiver-driven layered multicast for scalable video over the Internet

    Page(s): 482 - 495

    In this paper, we propose and analyze a new system architecture for video multicast over the Internet, namely, the sender-adaptive and receiver-driven layered multicast (SARLM). In SARLM, the sender of a video source splits the video data coded by a scalable codec and a channel codec into multiple data streams, each of which corresponds to a separate multicast group. The sender can dynamically adjust the way in which the video sequence is split, based on the receivers' network parameters collected through feedback. Meanwhile, a receiver can estimate available bandwidth based on a modified packet-pair technique and choose to reassemble and play back the video sequence at a given quality level by dynamically subscribing to part or all of the data streams according to its network conditions. To optimize the sender's adaptation strategy, we introduce a quality-space (Q-Space) model to describe and analyze the mathematical relationship between the sending rate of different SARLM layers and the video quality received by a given receiver identified by its network characteristics, including available bandwidth and packet loss ratio. Our simulation results demonstrate that, under the same network topology and conditions, the SARLM architecture can achieve higher network throughput and better video quality on the receiver side than the existing approaches.
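The receiver-side idea in the abstract above (estimate bandwidth, then subscribe to as many layers as fit) can be sketched as follows. This assumes the basic packet-pair estimate, packet size divided by the arrival gap of back-to-back packets, not the paper's modified technique; the function names, the median filter, and the layer rates are all hypothetical.

```python
from statistics import median

def packet_pair_bandwidth(packet_size_bytes, arrival_gaps_s):
    """Estimate bottleneck bandwidth (bit/s) from back-to-back packet pairs.

    Basic packet-pair rule: bandwidth = packet size / arrival gap.
    The median over several pairs is an assumed, simple noise filter.
    """
    gaps = [g for g in arrival_gaps_s if g > 0]
    if not gaps:
        raise ValueError("need at least one positive arrival gap")
    return packet_size_bytes * 8 / median(gaps)

def layers_to_subscribe(layer_rates_bps, bandwidth_bps):
    """Count how many cumulative layers fit within the estimated bandwidth."""
    total = 0.0
    count = 0
    for rate in layer_rates_bps:
        if total + rate > bandwidth_bps:
            break
        total += rate
        count += 1
    return count

# Hypothetical measurements: 1000-byte packets, gaps in seconds.
bw = packet_pair_bandwidth(1000, [0.010, 0.008, 0.012])
n_layers = layers_to_subscribe([256_000, 256_000, 512_000], bw)
```

Layers are treated as cumulative here, so a receiver joins the base layer first and adds enhancement layers in order until the next one would exceed its estimate.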
  • Rate control for videophone using local perceptual cues

    Page(s): 496 - 507

    We present a method for extracting local visual perceptual cues and its application to rate control in videophone, so that scarce bits are assigned for maximum perceptual coding quality. The optimum quantization step is determined with the rate-distortion model considering the local perceptual cues in the visual signal. For extraction of the perceptual cues, luminance adaptation and texture masking are used as the stimulus-driven factors, while skin color serves as the cognition-driven factor in the current implementation. The proposed perceptual rate control (PRC) scheme is evaluated both objectively and subjectively on the H.263 platform, and the evaluations show that it achieves significant quality improvement in block-based coding for bandwidth-hungry applications.
  • Application specific instruction-set processor template for motion estimation in video applications

    Page(s): 508 - 527

    The gap between application specific integrated circuits (ASICs) and general-purpose programmable processors in terms of performance, power, cost and flexibility is well known. Application specific instruction-set processors (ASIPs) bridge this gap. In this work, we demonstrate the key benefits of ASIPs for several video applications. One of the most compute- and memory-intensive functions in video processing is motion estimation (ME). The focus of this work is on the design of an ME template, which is useful for several video applications like video encoding, obstacle detection, picture-rate up-conversion, 2-D-to-3-D video conversion, etc. An instruction set suitable for performing a variety of ME functions is developed. The ASIP is based on a very long instruction word (VLIW) processor template and meets low-power and low-cost requirements while still providing the flexibility needed for the application domain. The ME ASIP design consumes 27 mW and occupies an area of 1.1 mm² in 0.13 μm technology when performing picture-rate up-conversion for standard definition (CCIR601) resolution at 50 frames per second.
  • Efficient and robust classification method using combined feature vector for lane detection

    Page(s): 528 - 537

    The aim of this paper is to develop a method for low-cost and accurate classification of image pixels of highways and rural roads for lane detection. The method uses three main components: adaptive/predefined image splitting, subimage-level classification, and class merging based on homogeneity checking conditions. In the first step, a preclassification into road and nonroad pixels is carried out on the resized input image using the decision tree method. As a result of this first step we obtain the road reference feature value and, in the case of highways, the lane-marking positions. For splitting rural road images we use a predefined division method, and for highways we use an adaptive division method based on the detected lane markings. The proposed classification is carried out on the subimages using the K-means classifier on a combined gray- and texture-based feature vector. The gray feature vector is fixed in the preclassification phase, and only the texture feature vector is updated while the classification is performed. This way the convergence is much faster and the classification accuracy is better. The resulting road and nonroad classes of the subimages are merged into a road and a nonroad class using a homogeneity criterion based on the road reference feature value. Next, a forward and backward method is used to detect the borders of the road region. Finally, we use the Kalman filter and Bresenham line drawing to connect the border pixels.
  • 3-D reconstruction of a dynamic environment with a fully calibrated background for traffic scenes

    Page(s): 538 - 549

    Vision-based traffic surveillance systems are increasingly employed for traffic monitoring, collection of statistical data and traffic control. We present an extension of such a system that additionally uses the captured image content for 3-D scene modeling and reconstruction. A basic goal of surveillance systems is to get good coverage of the observed area with as few cameras as possible to keep costs low. Therefore, the 3-D reconstruction has to be done from only a few original views with limited overlap and different lighting conditions. To cope with these specific restrictions we developed a model-based 3-D reconstruction scheme that exploits a priori knowledge about the scene. The system is fully calibrated offline by estimating camera parameters from measured 3-D-2-D correspondences. Then the scene is divided into static parts, which are modeled offline, and dynamic parts, which are processed online. To this end, we segment all views into moving objects and static background. The background is modeled as multitexture planes using the original camera textures. Moving objects are segmented and tracked in each view. All segmented views of a moving object are combined into a 3-D object, which is positioned and tracked in 3-D. Here we use predefined geometric primitives and map the original textures onto them. Finally, the static and dynamic elements are combined to create the reconstructed 3-D scene, where the user can freely navigate, i.e., choose an arbitrary viewpoint and direction. Additionally, the system allows analyzing the 3-D properties of the scene and the moving objects.
  • A constrained nonlinear energy minimization framework for the regularization of the stereo correspondence problem

    Page(s): 550 - 565

    In this paper, we propose a novel approach to stereo correspondence based on the optimization of a continuous disparity surface defined parametrically using radial basis functions. Principal advantages over other methods include the use of constrained nonlinear programming to perform regularization as a hierarchical multiobjective optimization, which differs from the standard weighted sum approach, so that regularization becomes more consistent with the notion of Pareto optimality. Furthermore, the optimization algorithm is capable of handling arbitrary constraints on the sought parameters, so that a variety of types of a priori scene information can be incorporated explicitly into the problem definition. To exemplify this we derive a new continuous unary formulation of the disparity gradient limit constraint and propose other types of potential constraints for a priori knowledge. Furthermore, the optimization employs a smoothness-oriented regularization operator to preserve surface discontinuities, a flexible block decomposition approach of the disparity surface to allow parallelization, and a correlation-based fitting with heuristics to initialize the parameters and avoid local optima effectively. Experiments with standard stereo imagery show that the method adequately handles the imposed constraints and produces surfaces with an accurate level of elevation detail.
  • A heuristic approach for finding best focused shape

    Page(s): 566 - 574

    The most popular shape from focus (SFF) methods in the literature are based on the concept of the focused image surface (FIS), the surface formed by the best focus points. According to paraxial-geometric optics, there is a one-to-one correspondence between the shape of an object and the shape of its FIS. Therefore, the problem of three-dimensional (3-D) shape recovery from image focus can be described as the problem of determining the shape of the FIS. The conventional SFF method is inaccurate because of its piecewise constant approximation of the FIS. The SFF method based on the FIS has shown better results through exhaustive search of the FIS shape using planar surface approximation, at the cost of considerably more computation. In this paper, the search for the FIS shape is presented as an optimization problem, i.e., maximization of the focus measure in the 3-D image volume. The proposed method searches for the optimal focus measure in the whole image volume, instead of the small volume adopted in previous methods. Dynamic programming, instead of approximation techniques, is used to search for the optimal FIS shape. A direct application of dynamic programming to 3-D data is impractical because of its high computational complexity. Therefore, a fast heuristic model based on dynamic programming is proposed for the search of the FIS shape. The shape recovery results of the new method are better than those of previous methods. The proposed algorithm is significantly faster than the FIS algorithm, but a little slower than the conventional algorithm.
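The baseline idea that the abstract above builds on, picking for each pixel the frame in which a focus measure peaks, can be sketched as below. This is a simplified illustration under stated assumptions: the squared 4-neighbour Laplacian is one common choice of focus measure, and the names `focus_measure` and `depth_from_focus` are hypothetical. It is not the paper's dynamic-programming heuristic, and it omits the window summation that conventional SFF methods typically apply.

```python
import numpy as np

def focus_measure(image):
    """Per-pixel focus measure: squared 4-neighbour Laplacian response."""
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, 1, mode="edge")  # replicate borders
    lap = (padded[:-2, 1:-1] + padded[2:, 1:-1]
           + padded[1:-1, :-2] + padded[1:-1, 2:] - 4.0 * img)
    return lap ** 2

def depth_from_focus(stack):
    """For each pixel, return the index of the frame with the highest focus measure."""
    measures = np.stack([focus_measure(frame) for frame in stack])
    return np.argmax(measures, axis=0)

# Synthetic stack: only the middle frame carries sharp detail.
flat = np.full((6, 6), 0.5)
sharp = (np.indices((6, 6)).sum(axis=0) % 2).astype(float)  # checkerboard
depth = depth_from_focus([flat, sharp, flat])
```

The returned index map is a piecewise constant depth estimate, one frame index per pixel, which is the kind of approximation the abstract describes as the source of inaccuracy in the conventional method.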
  • Tracking video objects in cluttered background

    Page(s): 575 - 584

    We present an algorithm for tracking video objects which is based on a hybrid strategy. This strategy uses both object and region information to solve the correspondence problem. Low-level descriptors are exploited to track objects' regions and to cope with track management issues. Appearance and disappearance of objects, splitting and partial occlusions are resolved through interactions between regions and objects. Experimental results demonstrate that this approach can deal with multiple deformable objects whose shape varies over time. Furthermore, it is very simple, because the tracking is based on the descriptors, which represent a very compact piece of information about regions and are easy to define and track automatically. Finally, this procedure implicitly provides a description of the objects and their tracks, thus enabling indexing and manipulation of the video content.
  • Call for participation for 2005 IEEE International Symposium on Circuits and Systems (ISCAS2005)

    Page(s): 585
    Freely Available from IEEE
  • Explore IEL IEEE's most comprehensive resource

    Page(s): 586
    Freely Available from IEEE
  • Quality without compromise [advertisement]

    Page(s): 587
    Freely Available from IEEE
  • IEEE order form for reprints

    Page(s): 588
    Freely Available from IEEE
  • IEEE Circuits and Systems Society Information

    Page(s): c3
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multi-Processor Systems: Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

 

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it