
IEEE Transactions on Circuits and Systems for Video Technology

Issue 4 • April 2014


Displaying Results 1 - 18 of 18
  • Table of contents

    Publication Year: 2014 , Page(s): C1
    PDF (62 KB)
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Publication Year: 2014 , Page(s): C2
    PDF (140 KB)
    Freely Available from IEEE
  • Evaluation of Different Algorithms of Nonnegative Matrix Factorization in Temporal Psychovisual Modulation

    Publication Year: 2014 , Page(s): 553 - 565
    PDF (12374 KB)

    Temporal psychovisual modulation (TPVM) is a newly proposed information display paradigm, which can be implemented by nonnegative matrix factorization (NMF) with additional upper-bound constraints on the variables. In this paper, we study the state-of-the-art NMF algorithms, extend them to incorporate the upper bounds, and discuss their potential use in TPVM. By comparing all the NMF algorithms with their extended versions, we find that: 1) the factorization error of the truncated alternating least squares algorithm fluctuates throughout the iterations; 2) the alternating nonnegative least squares based algorithms may slow down dramatically under the upper-bound constraints; and 3) the hierarchical alternating least squares (HALS) algorithm converges the fastest, and its final factorization error is often the smallest among all the algorithms. Based on the experimental results of HALS, we propose a guideline for determining the parameter settings of TPVM, that is, the number of viewers to support and the scaling factor for adjusting the light intensity of the images formed by TPVM. This work will facilitate the application of TPVM.
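
    Where the abstract singles out HALS as the best performer, the following is a minimal sketch of a HALS iteration extended with an upper bound: each row of H and column of W receives the usual HALS correction and is then clipped to [0, u]. A toy illustration on random data under simplifying assumptions, not the authors' implementation.

```python
import numpy as np

def hals_bounded_nmf(V, rank, upper=1.0, iters=200, seed=0):
    """HALS-style NMF with an upper bound on the factors: V ~ W @ H,
    with 0 <= W, H <= upper. A sketch of the bounded extension the
    abstract describes, not the authors' exact algorithm."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.uniform(0, upper, (m, rank))
    H = rng.uniform(0, upper, (rank, n))
    eps = 1e-12
    for _ in range(iters):
        # Update H row by row, then W column by column.
        WtV, WtW = W.T @ V, W.T @ W
        for k in range(rank):
            grad = WtV[k] - WtW[k] @ H          # residual projected on W[:, k]
            H[k] = np.clip(H[k] + grad / (WtW[k, k] + eps), 0.0, upper)
        VHt, HHt = V @ H.T, H @ H.T
        for k in range(rank):
            grad = VHt[:, k] - W @ HHt[:, k]
            W[:, k] = np.clip(W[:, k] + grad / (HHt[k, k] + eps), 0.0, upper)
    return W, H

V = np.random.default_rng(1).uniform(0, 1, (64, 48))
W, H = hals_bounded_nmf(V, rank=4)
print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```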

  • Bare-fingers Touch Detection by the Button's Distortion in a Projector–Camera System

    Publication Year: 2014 , Page(s): 566 - 575
    Cited by:  Papers (1)
    PDF (8504 KB)

    In this paper, we propose a novel interactive projection system (IPS) that enables bare-finger touch interaction on regular planar surfaces (e.g., walls, tables) with only one standard camera and one projector. The challenge of bare-finger touch detection is recovering the touch information from only the 2-D image captured by the camera. In our method, the graphical user interface (GUI) button is projected on the surface and is distorted by the finger when clicking it, and there is a significant positive correlation between the button's distortion and the finger's height above the surface. Therefore, we propose a novel, fast, and robust algorithm that takes advantage of the button's distortion to detect the touch action. The proposed touch detection algorithm is performed in three stages: 1) region-of-interest extraction through a homography mapping, which reduces the computational complexity of the subsequent processing; 2) detection of the button's distortion using a special edge detection algorithm, which greatly reduces errors due to the finger's shadows and edges; and 3) touch action judgment based on the button's distortion. Several applications (e.g., virtual keyboard, PowerPoint viewing) that use the proposed button-based touch detection method are shown in this paper. An evaluation performed on the virtual keyboard demonstrates that the proposed approach can detect bare-finger touch in real time with a missed detection rate of 1.00%, a false detection rate of 2.08%, and a touch detection rate of 96.92% at the typical projection distance.
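
    As a rough illustration of stage 1, the sketch below (assuming OpenCV, with made-up corner coordinates) rectifies a projected button into a small region of interest via a homography, and runs a plain Canny detector where stage 2's specialized edge analysis would go.

```python
import cv2
import numpy as np

# Stage 1 of the pipeline: a homography maps the projected button's known
# corners from the camera view to a small rectified patch, so the later
# distortion/edge analysis touches only a tiny ROI.
proj_rect = np.float32([[0, 0], [100, 0], [100, 40], [0, 40]])              # button in projector space
cam_corners = np.float32([[210, 180], [330, 190], [325, 235], [205, 228]])  # same button seen by camera

H = cv2.getPerspectiveTransform(cam_corners, proj_rect)   # camera -> rectified button

# Synthetic stand-in for a captured camera frame with the button outline.
frame = np.zeros((480, 640, 3), np.uint8)
cv2.polylines(frame, [cam_corners.astype(np.int32).reshape(-1, 1, 2)],
              True, (255, 255, 255), 2)

roi = cv2.warpPerspective(frame, H, (100, 40))             # rectified 100x40 button patch

# Stage 2 would analyze edges in `roi`: a pressed (distorted) button
# produces edges inside the patch that a flat projection does not.
edges = cv2.Canny(cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY), 50, 150)
print("edge pixels in rectified button ROI:", int(edges.sum() // 255))
```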

  • Optimized Brightness Compensation and Contrast Enhancement for Transmissive Liquid Crystal Displays

    Publication Year: 2014 , Page(s): 576 - 590
    PDF (72803 KB)

    An optimized brightness-compensated contrast enhancement (BCCE) algorithm for transmissive liquid crystal displays (LCDs) is proposed in this paper. We first develop a global contrast enhancement scheme to compensate for the reduced brightness when the backlight of an LCD device is dimmed for power reduction. We also derive a distortion model to describe the information loss due to the brightness compensation. We then formulate an objective function that consists of a contrast enhancement term and a distortion term. By minimizing the objective function, we maximize the backlight-scaled image contrast subject to a constraint on the distortion. Simulation results show that the proposed BCCE algorithm provides high-quality images even when the backlight intensity is reduced by 50-70% to save power.
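
    A toy rendering of the trade-off the abstract formulates: choose a compensation gain that maximizes a crude contrast measure of the backlight-scaled image minus a penalty for clipping loss. The single-gain model and the weight lam are simplifying assumptions for illustration, not the paper's optimization.

```python
import numpy as np

def objective(x, b, g, lam=2.0):
    y = np.clip(g * x, 0.0, 1.0)             # brightness-compensated pixel values
    perceived = b * y                         # dimmed backlight scales everything
    contrast = perceived.std()                # crude contrast measure
    distortion = np.mean((np.minimum(g * x, 1.0) - g * x) ** 2)  # clipping loss
    return contrast - lam * distortion        # enhancement term minus distortion term

x = np.random.default_rng(0).uniform(0, 1, 100_000)  # stand-in image histogram
b = 0.5                                              # backlight reduced by 50%
gains = np.linspace(1.0, 1.0 / b, 50)
best = max(gains, key=lambda g: objective(x, b, g))
print(f"backlight at {b:.0%}: chosen compensation gain {best:.2f}")
```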

  • No-Reference Quality Assessment for Stereoscopic Images Based on Binocular Quality Perception

    Publication Year: 2014 , Page(s): 591 - 602
    Cited by:  Papers (1)
    PDF (13130 KB)

    Quality perception of 3-D images is one of the most important factors in accelerating advances in 3-D imaging. Despite active research in recent years on understanding the quality perception of 3-D images, binocular quality perception of asymmetric distortions in stereoscopic images is not yet thoroughly understood. In this paper, we explore the relationship between the perceptual quality of stereoscopic images and visual information, and introduce a model for binocular quality perception. Based on this model, a no-reference quality metric for stereoscopic images is proposed. The proposed metric is a top-down method that models the binocular quality perception of the human visual system with respect to blurriness and blockiness. Perceptual blurriness and blockiness scores of the left and right images are computed using local blurriness, blockiness, and visual saliency information, and are then combined into an overall quality index using the binocular quality perception model. Experiments on image and video databases show that the proposed metric correlates consistently with subjective quality scores. The results also show that the proposed metric outperforms existing full-reference methods even though it is a no-reference approach.
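
    A heavily hedged sketch of the combination step only: per-view distortion scores are pooled with saliency weights, and the two views are merged with a weight that leans toward the worse view, one common way to model how asymmetric distortion dominates binocular perception. The pooling weights and the 0.7 mixing factor are illustrative assumptions, not the paper's fitted model.

```python
import numpy as np

def pooled_score(distortion_map, saliency_map):
    w = saliency_map / saliency_map.sum()
    return float((w * distortion_map).sum())      # saliency-weighted pooling

def binocular_score(left_d, left_s, right_d, right_s, alpha=0.7):
    ql, qr = pooled_score(left_d, left_s), pooled_score(right_d, right_s)
    worse, better = max(ql, qr), min(ql, qr)
    return alpha * worse + (1 - alpha) * better   # worse view dominates

rng = np.random.default_rng(0)
sal = rng.uniform(0.1, 1.0, (60, 80))
left = rng.uniform(0.0, 0.2, (60, 80))            # mildly distorted left view
right = rng.uniform(0.3, 0.6, (60, 80))           # heavily distorted right view
print("binocular distortion score:", round(binocular_score(left, sal, right, sal), 3))
```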

  • Reliability-Based Multiview Depth Enhancement Considering Interview Coherence

    Publication Year: 2014 , Page(s): 603 - 616
    PDF (26043 KB)

    The color-plus-depth video format has become increasingly popular in 3-D video applications, such as autostereoscopic 3-D TV and free-viewpoint TV. The performance of these applications depends heavily on the quality of the depth maps, since intermediate views are synthesized using them. This paper presents a novel framework for obtaining high-quality multiview color-plus-depth video using a hybrid sensor, which consists of multiple color cameras and depth sensors. Given multiple high-resolution color images and low-quality depth maps obtained from the color cameras and depth sensors, we improve the quality of the depth map corresponding to each color view by increasing its spatial resolution and enforcing inter-view coherence. Specifically, a new up-sampling method considering inter-view coherence is proposed to enhance the multiview depth maps. This approach can improve the performance of existing up-sampling algorithms, such as joint bilateral up-sampling and weighted mode filtering, which were developed to enhance a single-view depth map only. In addition, an adaptive approach for fusing multiple input low-resolution depth maps is proposed, based on a reliability measure that considers camera geometry and depth validity. The proposed framework can be extended into the temporal domain to obtain temporally consistent depth maps. Experimental results demonstrate that the proposed method provides better multiview depth quality than conventional single-view-based methods. We also show that it achieves results comparable to fusion approaches that employ both depth sensors and stereo matching, yet much more efficiently. Moreover, the proposed method significantly reduces the bit rates required to compress multiview color-plus-depth video.
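
    For reference, a plain joint bilateral up-sampling baseline of the kind the abstract says it improves on: low-resolution depth is up-sampled with spatial weights plus range weights taken from the registered high-resolution color image, so depth edges snap to color edges. Loop-based Python for clarity; the sigma values are illustrative.

```python
import numpy as np

def jbu(depth_lo, color_hi, scale, sigma_s=1.0, sigma_r=0.1, radius=2):
    """Joint bilateral up-sampling of depth_lo guided by color_hi."""
    h, w = color_hi.shape[:2]
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            cy, cx = y / scale, x / scale                  # position in low-res grid
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ly, lx = int(round(cy)) + dy, int(round(cx)) + dx
                    if not (0 <= ly < depth_lo.shape[0] and 0 <= lx < depth_lo.shape[1]):
                        continue
                    ws = np.exp(-((ly - cy) ** 2 + (lx - cx) ** 2) / (2 * sigma_s ** 2))
                    # Guidance color sampled at the low-res sample's high-res location.
                    gy, gx = min(int(ly * scale), h - 1), min(int(lx * scale), w - 1)
                    dc = color_hi[y, x] - color_hi[gy, gx]
                    wr = np.exp(-float(dc @ dc) / (2 * sigma_r ** 2))
                    num += ws * wr * depth_lo[ly, lx]
                    den += ws * wr
            out[y, x] = num / max(den, 1e-12)
    return out

rng = np.random.default_rng(0)
color = rng.uniform(0, 1, (32, 32, 3))
depth = rng.uniform(0, 1, (8, 8))
print("upsampled depth shape:", jbu(depth, color, scale=4).shape)
```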

  • Multiview Gait Recognition Based on Patch Distribution Features and Uncorrelated Multilinear Sparse Local Discriminant Canonical Correlation Analysis

    Publication Year: 2014 , Page(s): 617 - 630
    PDF (12266 KB)

    It is well recognized that gait is an important biometric feature for identifying a person at a distance, as in video surveillance applications. In practice, however, a change of viewing angle poses a significant challenge for gait recognition. In this paper, a novel approach is proposed for multiview gait recognition in which the view angle of a probe gait sequence is unknown. We formulate a new classification framework based on patch distribution features to estimate the view angle of each probe gait sequence. In this method, each gait energy image is represented as a set of dual-tree complex wavelet transform (DTCWT) features derived from different scales and orientations, together with the x-y coordinates. A two-stage Gaussian mixture model is then presented that can represent each DTCWT-based gait feature with a set of patch distribution parameters, and a simple nearest-neighbor classifier is employed for view classification. To measure the similarity of gait sequences, we also propose a sparse local discriminant canonical correlation analysis (SLDCCA) algorithm that models the correlation of gait features from different views and uses the correlation strength as the similarity measure. An uncorrelated multilinear SLDCCA (UMSLDCCA) framework is further presented that extracts uncorrelated discriminative features directly from multidimensional gait features by solving a tensor-to-vector projection. The solution consists of sequential iterative processes based on the alternating projection method. Unlike existing approaches, UMSLDCCA considers the spatial structure information within each gait sample and the local geometry information among multiple gait samples. Moreover, our approach does not need explicit reconstruction and is robust against feature noise. Extensive experiments on two benchmark gait databases demonstrate that our method outperforms state-of-the-art methods in both accuracy and efficiency.
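
    A stripped-down sketch of the view-estimation idea only (assuming scikit-learn is available): patch features from each known view angle are summarized by a small Gaussian mixture, and a probe sequence is assigned to the view whose mixture explains its patches best. Random vectors stand in for the DTCWT patch features; the paper's two-stage GMM and x-y augmentation are omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
views = {0: rng.normal(0.0, 1.0, (500, 8)),      # patch features at view 0 deg
         45: rng.normal(1.5, 1.0, (500, 8)),     # ... view 45 deg
         90: rng.normal(3.0, 1.0, (500, 8))}     # ... view 90 deg

# One small mixture model per gallery view angle.
models = {v: GaussianMixture(n_components=3, random_state=0).fit(X)
          for v, X in views.items()}

probe = rng.normal(1.4, 1.0, (200, 8))           # probe near the 45-deg cluster
estimated = max(models, key=lambda v: models[v].score(probe))  # max avg log-likelihood
print("estimated view angle:", estimated, "degrees")
```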

  • Adaptive Sparse Representations for Video Anomaly Detection

    Publication Year: 2014 , Page(s): 631 - 645
    Cited by:  Papers (1)
    PDF (30418 KB)

    Video anomaly detection can be used in the transportation domain to identify unusual patterns such as traffic violations, accidents, unsafe driver behavior, street crime, and other suspicious activities. A common class of approaches relies on object tracking and trajectory analysis. Very recently, sparse reconstruction techniques have been employed in video anomaly detection. The fundamental assumption underlying these methods is that any new feature representation of a normal/anomalous event can be approximately modeled as a (sparse) linear combination of prelabeled feature representations (of previously observed events) in a training dictionary. Sparsity can be a powerful prior on model coefficients, but challenges remain in detecting anomalies involving multiple objects and in the ability of the linear sparsity model to effectively allow for class separation. The proposed research addresses both of these issues. First, we develop a new joint sparsity model for anomaly detection that enables the detection of joint anomalies involving multiple objects. This extension is highly nontrivial, since it leads to a new simultaneous sparsity problem that we solve using a greedy pursuit technique. Second, we introduce nonlinearity into, that is, kernelize, the linear sparsity model to enable superior class separability and hence better anomaly detection. We test extensively on several real-world video datasets involving both single- and multiple-object anomalies. Results show marked improvements in anomaly detection in both supervised and unsupervised scenarios when using the proposed sparsity models.
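
    The baseline sparse-reconstruction test that the joint and kernelized models extend, as a sketch: code a new event against a dictionary of normal events with a greedy pursuit (orthogonal matching pursuit here) and flag large residuals as anomalous. Dictionary sizes and the threshold are illustrative.

```python
import numpy as np

def omp(D, y, k):
    """Greedy OMP: pick k atoms of dictionary D to approximate y."""
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return residual

rng = np.random.default_rng(0)
D = rng.normal(size=(30, 100))
D /= np.linalg.norm(D, axis=0)                   # unit-norm atoms of normal events

normal = D[:, [3, 40, 77]] @ np.array([0.8, 0.5, 0.3])   # lies in the dictionary's span
anomaly = rng.normal(size=30)                             # unrelated event

for name, y in [("normal", normal), ("anomaly", anomaly)]:
    r = omp(D, y, k=5)
    score = np.linalg.norm(r) / np.linalg.norm(y)         # relative residual
    print(f"{name:7s} residual ratio = {score:.3f} -> {'ANOMALY' if score > 0.5 else 'ok'}")
```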

  • Spatiotemporal Saliency Detection Using Textural Contrast and Its Applications

    Publication Year: 2014 , Page(s): 646 - 659
    Cited by:  Papers (2)
    PDF (80471 KB)

    Saliency detection has been studied extensively owing to its promising contributions to various computer vision applications. However, most existing methods are easily biased toward edges or corners, which are statistically significant but not necessarily relevant. Moreover, they often fail to find salient regions in complex scenes due to ambiguities between salient regions and highly textured backgrounds. In this paper, we present a novel unified framework for spatiotemporal saliency detection based on textural contrast. Our method is simple and robust, yet biologically plausible; thus, it can be easily extended to various applications, such as image retargeting, object segmentation, and video surveillance. On various datasets, we conduct comparative evaluations of 12 representative saliency detection models from the literature, and the results show that the proposed scheme outperforms previously developed methods in detecting salient regions in both static and dynamic scenes.
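
    A miniature illustration of the textural-contrast principle, with a deliberately simple texture signature (local gradient-energy statistics) standing in for the paper's descriptor: patches whose texture differs from the image-wide average light up, regardless of isolated edges or corners.

```python
import numpy as np

def texture_saliency(img, patch=8):
    h, w = img.shape
    gy, gx = np.gradient(img)
    energy = np.hypot(gx, gy)                    # per-pixel gradient energy
    feats, coords = [], []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            block = energy[y:y + patch, x:x + patch]
            feats.append([block.mean(), block.std()])   # tiny texture signature
            coords.append((y, x))
    feats = np.array(feats)
    # Saliency = distance of a patch's texture signature from the global average.
    contrast = np.linalg.norm(feats - feats.mean(axis=0), axis=1)
    sal = np.zeros((h, w))
    for (y, x), c in zip(coords, contrast):
        sal[y:y + patch, x:x + patch] = c
    return sal / sal.max()

rng = np.random.default_rng(0)
img = rng.normal(0, 0.02, (64, 64))                 # smooth background
img[24:40, 24:40] += rng.normal(0, 0.5, (16, 16))   # one highly textured region
sal = texture_saliency(img)
print("most salient pixel:", np.unravel_index(sal.argmax(), sal.shape))
```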

  • Fast Intra Mode Decision for High Efficiency Video Coding (HEVC)

    Publication Year: 2014 , Page(s): 660 - 668
    Cited by:  Papers (1)
    PDF (11196 KB)

    The latest High Efficiency Video Coding (HEVC) standard requires only 50% of the bit rate of H.264/AVC at the same perceptual quality, but at a significantly increased encoder complexity. It is therefore necessary to develop fast HEVC encoding algorithms for the standard's market adoption. In this paper, we propose a fast intra mode decision for the HEVC encoder. The overall algorithm consists of both micro- and macro-level schemes. At the micro-level, we propose a Hadamard cost-based progressive rough mode search (pRMS) that checks potential modes selectively instead of traversing all candidates (up to 35 in HEVC). Fewer effective candidates are then chosen by the pRMS for the subsequent rate-distortion optimized quantization (RDOQ), which derives the rate-distortion (R-D) optimal mode. An early RDOQ skip method is also introduced to reduce complexity further. At the macro-level, we introduce early coding unit (CU) split termination if the estimated R-D cost [through aggregated R-D costs of (partial) sub-CUs] is already larger than the R-D cost of the current CU. On average, the proposed fast intra mode decision provides about a 2.5x speedup (without any platform or source-code-level optimization) with just a 1.0% Bjontegaard delta rate (BD-rate) increase under the HEVC common test conditions. Our solution also demonstrates state-of-the-art performance in comparison with other works.
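
    The flavor of the micro-level pRMS, sketched with a dummy predictor standing in for real HEVC intra prediction: score a coarse subset of the 35 modes with a cheap Hadamard-domain (SATD) cost, then refine only around the best coarse mode. Step sizes are illustrative, not the paper's schedule.

```python
import numpy as np

H4 = np.array([[1,  1,  1,  1],
               [1, -1,  1, -1],
               [1,  1, -1, -1],
               [1, -1, -1,  1]])

def satd(block, pred):
    d = (block - pred).astype(float)
    return np.abs(H4 @ d @ H4.T).sum()           # 4x4 Hadamard-domain SAD

rng = np.random.default_rng(0)
block = rng.integers(0, 255, (4, 4))

def predict(mode):                               # toy predictor, not HEVC intra prediction
    return np.clip(np.arange(16).reshape(4, 4) * 8 + mode * 3, 0, 255)

coarse = list(range(0, 35, 4))                   # coarse sweep, step 4
best = min(coarse, key=lambda m: satd(block, predict(m)))
refine = [m for m in (best - 2, best - 1, best + 1, best + 2) if 0 <= m < 35]
best = min([best] + refine, key=lambda m: satd(block, predict(m)))
print("selected mode:", best, "of 35, using", len(coarse) + len(refine), "SATD evaluations")
```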

  • On the Cost–QoE Tradeoff for Cloud-Based Video Streaming Under Amazon EC2's Pricing Models

    Publication Year: 2014 , Page(s): 669 - 680
    PDF (9335 KB)

    The emergence of cloud computing provides a cost-effective way to deliver video streams to a large number of end users with the desired quality of experience (QoE). Under such a paradigm, a video service provider (VSP) can launch its own video streaming services virtually by renting distribution infrastructure from one or more cloud service providers (CSPs). However, CSPs such as Amazon EC2 normally offer multiple pricing options for the virtual machine (VM) instances they provide, such as on-demand instances, reserved instances, and spot instances. Such diverse pricing models make it challenging for a VSP to determine how to optimally procure the required number of VM instances of different types to satisfy dynamic user demands. Given a limited budget, a VSP needs to carefully balance the procurement cost against the QoE achieved for end users. In this paper, we investigate the tradeoff between the cost incurred by VM instance procurement and the achieved QoE of end users under Amazon EC2's pricing models, and formulate the VM instance provisioning and procurement problem as a constrained stochastic optimization problem. By applying the Lyapunov optimization framework, we design an online procurement algorithm that approaches the optimal solution with explicitly provable upper bounds. We also conduct extensive trace-driven simulations, and the results show that our proposed algorithm (OPT-ORS) achieves a good balance between procurement cost and user QoE for cloud-based VSPs. In the achieved near-optimal situation, our algorithm guarantees that reserved VM instances are fully utilized to satisfy the baseline user demand, on-demand VM instances are rented only to handle flash crowds, and, owing to their low prices, more spot VM instances than on-demand VM instances are rented to serve user demand above the baseline.
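
    A toy drift-plus-penalty policy in the spirit of the Lyapunov approach the abstract describes: a backlog queue of unserved demand justifies renting an instance type only when the backlog exceeds a price-scaled threshold, so cheap but scarce spot capacity is drawn on first and on-demand capacity only under flash crowds. All prices, capacities, and traces are made up; the paper's OPT-ORS policy is richer.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 50.0                     # Lyapunov trade-off knob: larger V favors low cost
RESERVED = 10                # prepaid reserved instances, always used first
Q = 0.0                      # backlog of unserved demand (the "virtual queue")
cost = backlog_sum = 0.0

for t in range(1000):
    demand = rng.poisson(12) + (40 if rng.random() < 0.05 else 0)  # flash crowds
    Q += max(demand - RESERVED, 0.0)
    # Drift-plus-penalty flavor: rent an instance type only while the backlog
    # exceeds its V-scaled price, so spot (cheap, variable supply) is exhausted
    # before on-demand (pricey, effectively unlimited).
    for price, cap in [(0.3, int(rng.integers(0, 8))), (1.0, 10**6)]:
        n = min(cap, max(Q - V * price, 0.0))
        Q -= n
        cost += n * price
    backlog_sum += Q

print(f"mean backlog {backlog_sum / 1000:.1f} units, total rental cost {cost:.0f}")
```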

  • A Control-Theoretic Approach to Rate Adaption for DASH Over Multiple Content Distribution Servers

    Publication Year: 2014 , Page(s): 681 - 694
    PDF (27070 KB)

    Recently, dynamic adaptive streaming over HTTP (DASH) has been widely deployed on the Internet. However, research on DASH over multiple content distribution servers (MCDS-DASH) remains limited. Compared with traditional single-server DASH, MCDS-DASH offers expanded bandwidth, link diversity, and reliability. Smoothing video bitrate switching over multiple servers is, however, a challenging problem because of their diverse bandwidths. In this paper, we propose a block-based rate adaptation method that considers both the diverse bandwidths and the buffered video time fed back by the client. In our method, multiple fragments are grouped into a block, and the fragments are downloaded in parallel from multiple servers. We propose to adapt the video bitrate at the block level rather than at the fragment level. By dynamically adjusting the block length and scheduling fragment requests to multiple servers, the video bitrates requested from the servers are synchronized, so that fragments are downloaded in an orderly way. We then propose a control-theoretic approach to select an appropriate bitrate for each block. By modeling and linearizing the rate adaptation system, we design a novel proportional-derivative controller that adapts the video bitrate with high responsiveness and stability. Theoretical analysis and extensive experiments on our network testbed and on the Internet demonstrate the efficiency of the proposed method.
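
    A bare-bones proportional-derivative rate controller of the general kind analyzed here: the control signal drives the buffered video time toward a target and is mapped onto a bitrate ladder. The gains, ladder, and bandwidth trace below are illustrative assumptions, not the paper's tuning.

```python
KP, KD = 0.05, 0.15                  # proportional / derivative gains (illustrative)
TARGET = 20.0                        # target buffered video time, seconds
LADDER = [0.5, 1.0, 2.0, 4.0, 8.0]   # available representations, Mbps
BLOCK = 4.0                          # seconds of video per downloaded block

buffer_s = 10.0
prev_err = buffer_s - TARGET
for bw in [3.0, 3.2, 2.8, 5.0, 5.5, 1.0, 1.2, 4.0, 4.5, 4.2]:  # measured Mbps per block
    err = buffer_s - TARGET
    u = KP * err + KD * (err - prev_err)       # PD control signal
    prev_err = err
    wanted = max(bw * (1.0 + u), LADDER[0])    # modulate around measured bandwidth
    rate = max(r for r in LADDER if r <= wanted)
    # Buffer drains while the block downloads, then gains BLOCK seconds of video.
    buffer_s = max(buffer_s - rate * BLOCK / bw, 0.0) + BLOCK
    print(f"bw={bw:3.1f} Mbps  buffer={buffer_s:5.1f} s  -> chose {rate} Mbps")
```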

  • A New Secure Image Transmission Technique via Secret-Fragment-Visible Mosaic Images by Nearly Reversible Color Transformations

    Publication Year: 2014 , Page(s): 695 - 703
    PDF (10068 KB) | HTML

    A new secure image transmission technique is proposed that automatically transforms a given large-volume secret image into a so-called secret-fragment-visible mosaic image of the same size. The mosaic image, which looks similar to an arbitrarily selected target image and may be used as camouflage for the secret image, is produced by dividing the secret image into fragments and transforming their color characteristics into those of the corresponding blocks of the target image. Skillful techniques are designed to conduct the color transformation process so that the secret image may be recovered nearly losslessly. A scheme for handling overflows/underflows in the converted pixels' color values, which records the color differences in the untransformed color space, is also proposed. The information required for recovering the secret image is embedded into the created mosaic image by a lossless data hiding scheme using a key. Experimental results show the feasibility of the proposed method.
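
    One standard way to instantiate the color-characteristic transformation the abstract describes is a mean/std transfer, sketched below for a single tile and channel: map the secret tile's statistics to the target block's, and keep the (embedded) parameters so the receiver can invert the mapping nearly losslessly. Overflow handling and the data-hiding step are omitted; values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
secret_tile = rng.integers(40, 90, (8, 8)).astype(float)     # one tile, one channel
target_block = rng.integers(150, 220, (8, 8)).astype(float)

mu_s, sd_s = secret_tile.mean(), secret_tile.std()
mu_t, sd_t = target_block.mean(), target_block.std()

# Forward: make the tile's color statistics match the target block's.
mosaic_tile = np.clip((secret_tile - mu_s) * (sd_t / sd_s) + mu_t, 0, 255).round()

# Receiver side: invert with the transmitted parameters (mu_s, sd_s, mu_t, sd_t).
recovered = (mosaic_tile - mu_t) * (sd_s / sd_t) + mu_s
print("max recovery error:", np.abs(recovered - secret_tile).max())
```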

  • Semisupervised Hashing via Kernel Hyperplane Learning for Scalable Image Search

    Publication Year: 2014 , Page(s): 704 - 713
    Cited by:  Papers (1)
    PDF (720 KB)

    Hashing methods, which seek a compact binary code for each image, have been shown to be efficient for scalable content-based image retrieval. In this paper, we propose a new hashing method called semisupervised kernel hyperplane learning (SKHL) for semantic image retrieval, which models each hashing function as a nonlinear kernel hyperplane constructed from an unlabeled dataset. Moreover, a Fisher-like criterion is proposed to learn the optimal kernel hyperplanes and hashing functions, using only weakly labeled training samples with side information. To further integrate different types of features, we also incorporate multiple kernel learning (MKL) into the proposed SKHL (called SKHL-MKL), leading to better hashing functions. Comprehensive experiments on the CIFAR-100 and NUS-WIDE datasets demonstrate the effectiveness of SKHL and SKHL-MKL.
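
    The general form of one hash bit in kernel-hyperplane hashing, sketched with random weights in place of SKHL's learned ones: h(x) = sign(sum_i w_i k(x, a_i) - b) over anchor points drawn from unlabeled data, with retrieval by Hamming distance on the resulting codes. Anchor count, kernel width, and code length are illustrative.

```python
import numpy as np

def rbf(X, A, gamma=0.5):
    d2 = ((X[:, None, :] - A[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)                    # kernel map against anchors

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 16))                # unlabeled image features
anchors = data[rng.choice(1000, 32, replace=False)]

n_bits = 24
W = rng.normal(size=(32, n_bits))                 # stand-in for learned hyperplanes
proj = rbf(data, anchors) @ W
# Median thresholding per bit keeps the codes balanced.
codes = (proj > np.median(proj, axis=0)).astype(np.uint8)

# Retrieval = Hamming distance on the compact codes.
query = codes[0]
ham = (codes != query).sum(axis=1)
print("nearest by Hamming distance (query itself first):", np.argsort(ham)[:5])
```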

  • Cost-Effective Hardware-Sharing Design of Fast Algorithm Based Multiple Forward and Inverse Transforms for H.264/AVC, MPEG-1/2/4, AVS, and VC-1 Video Encoding and Decoding Applications

    Publication Year: 2014 , Page(s): 714 - 720
    PDF (8868 KB)

    In this letter, fast-algorithm-based multiple forward and inverse transforms and their hardware-sharing design are developed at low hardware cost for multistandard video coding applications, covering the 2 x 2, 4 x 4, and 8 x 8 transforms in H.264/AVC, the 8 x 8 transform in the audio video coding standard (AVS), the 4 x 4 and 8 x 8 transforms in VC-1, and the DCT/IDCT in MPEG-1/2/4. Compared with directly combined fast transforms without sharing, the proposed low-cost 1-D architecture reduces shifters by 67%, adders by 73%, and gate counts by 53.4%. The hardware-sharing efficiencies of shifters and adders in the proposed 1-D transform design are 32% and 25% higher, respectively, than those of the previous design. In 0.18-um CMOS technology, the proposed 2-D transform architecture has lower normalized power per mode and higher normalized hardware efficiency than previous multistandard designs. The cost-effective, fully pipelined 2-D transform supports real-time multistandard encoding and decoding of 1080HD video at 60 Hz.
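
    For a sense of the shift-and-add datapath such hardware-sharing designs multiplex, here is a 1-D software model of the H.264/AVC 4-point forward integer transform: 8 additions and 2 shifts, no multipliers. How the letter shares this adder network across the other standards' transforms is paraphrased from the abstract, not reproduced here.

```python
import numpy as np

def h264_forward_4(x0, x1, x2, x3):
    s0, s1 = x0 + x3, x1 + x2          # stage-1 sums
    d0, d1 = x0 - x3, x1 - x2          # stage-1 differences
    y0, y2 = s0 + s1, s0 - s1          # even outputs
    y1 = (d0 << 1) + d1                # 2*d0 + d1: shift replaces a multiplier
    y3 = d0 - (d1 << 1)                # d0 - 2*d1
    return y0, y1, y2, y3              # total: 8 adds, 2 shifts

# Cross-check the butterfly against the matrix definition of the transform.
Cf = np.array([[1,  1,  1,  1],
               [2,  1, -1, -2],
               [1, -1, -1,  1],
               [1, -2,  2, -1]])
x = np.array([5, -3, 7, 2])
print("butterfly:", h264_forward_4(*x))
print("matrix   :", tuple(Cf @ x))
```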

  • IEEE Circuits and Systems Society Information

    Publication Year: 2014 , Page(s): C3
    PDF (119 KB)
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology information for authors

    Publication Year: 2014 , Page(s): C4
    PDF (119 KB)
    Freely Available from IEEE

Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology 

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it