
IEEE Transactions on Circuits and Systems for Video Technology

Issue 5 • Date May 2015


Displaying Results 1 - 18 of 18
  • Table of contents

    Publication Year: 2015 , Page(s): C1
    PDF (163 KB)
    Freely Available from IEEE
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Publication Year: 2015 , Page(s): C2
    PDF (138 KB)
    Freely Available from IEEE
  • Robust Histogram Shape-Based Method for Image Watermarking

    Publication Year: 2015 , Page(s): 717 - 729
    PDF (6001 KB) | HTML

    Cropping and random bending are two common attacks in image watermarking. In this paper, we propose a novel image-watermarking method to deal with these attacks, as well as other common attacks. In the embedding process, we first preprocess the host image with a Gaussian low-pass filter. Then, a secret key is used to randomly select a number of gray levels, and the histogram of the filtered image with respect to these selected gray levels is constructed. After that, a histogram-shape-related index is introduced to choose the pixel groups with the highest number of pixels, and a safe band is built between the chosen and nonchosen pixel groups. A watermark-embedding scheme is proposed to insert watermarks into the chosen pixel groups. The use of the histogram-shape-related index and the safe band results in good robustness. Moreover, a novel high-frequency component modification mechanism is also utilized in the embedding scheme to further improve robustness. At the decoding end, based on the available secret key, the watermarked pixel groups are identified and watermarks are extracted from them. The effectiveness of the proposed image-watermarking method is demonstrated by simulation examples.
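The key-based gray-level selection and pixel-group choice described in the abstract can be sketched as follows. This is a minimal illustration only, assuming a grayscale image stored as a NumPy array; the function names and parameter values (64 candidate levels, 16 chosen groups) are hypothetical, not taken from the paper:

```python
import numpy as np

def select_gray_levels(key, num_levels=64):
    """Seed an RNG with the secret key and randomly pick distinct gray levels."""
    rng = np.random.default_rng(key)
    return np.sort(rng.choice(256, size=num_levels, replace=False))

def histogram_over_levels(image, levels):
    """Histogram of the (low-pass filtered) image restricted to the selected levels."""
    hist = np.bincount(image.ravel(), minlength=256)
    return hist[levels]

def choose_pixel_groups(hist, levels, num_groups=16):
    """Shape-related index (simplified): keep the bins with the highest pixel
    counts; the remaining, nonchosen bins act as a safe band around them."""
    order = np.argsort(hist)[::-1]
    return np.sort(levels[order[:num_groups]])

image = np.clip(np.random.default_rng(0).normal(128, 30, (64, 64)), 0, 255).astype(np.uint8)
levels = select_gray_levels(key=1234)
hist = histogram_over_levels(image, levels)
chosen = choose_pixel_groups(hist, levels)
print(len(chosen))  # 16 chosen pixel groups
```

Because both encoder and decoder derive the same gray levels from the shared key, the decoder can re-identify the watermarked pixel groups without side information.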

  • A Fast Trilateral Filter-Based Adaptive Support Weight Method for Stereo Matching

    Publication Year: 2015 , Page(s): 730 - 743
    PDF (3539 KB) | HTML

    Adaptive support weight (ASW) methods represent the state of the art in local stereo matching, among which the bilateral filter-based ASW method achieves outstanding performance. However, this method fails to resolve the ambiguity induced by nearby pixels at different disparities but with similar colors. In this paper, we introduce a novel trilateral filter (TF)-based ASW method that remedies such ambiguities by considering the possible disparity discontinuities through color discontinuity boundaries, i.e., the boundary strength between two pixels, which is measured by a local energy model. We also present a recursive TF-based ASW method whose computational complexity is O(N) for the cost aggregation step and O(N log2 N) for boundary detection, where N denotes the input image size. This complexity is thus independent of the support window size. The recursive TF-based method is a nonlocal cost aggregation strategy. The experimental evaluation on the Middlebury benchmark shows that the proposed method, whose average error rate is 4.95%, outperforms other local methods in terms of accuracy. Equally, the average runtime of the proposed TF-based cost aggregation is roughly 260 ms on a 3.4-GHz Intel Core i7 CPU, which is comparable with state-of-the-art efficiency.
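The O(N) cost-aggregation claim can be illustrated with a 1-D recursive exponentially weighted filter: two passes over the row give every pixel the full weighted sum over all other pixels at constant cost per pixel. This is a simplified sketch of the general recursive-aggregation idea, not the paper's trilateral formulation:

```python
import numpy as np

def aggregate_costs(cost, w=0.5):
    """Exact O(N) computation of sum_j w**|i-j| * cost[j] for every i,
    using one forward and one backward recursive pass. The per-pixel work
    is constant, so the cost is independent of the support size."""
    n = len(cost)
    fwd = cost.astype(float).copy()
    for i in range(1, n):                 # left-to-right pass
        fwd[i] = cost[i] + w * fwd[i - 1]
    bwd = cost.astype(float).copy()
    for i in range(n - 2, -1, -1):        # right-to-left pass
        bwd[i] = cost[i] + w * bwd[i + 1]
    return fwd + bwd - cost               # cost[i] was counted in both passes

row = np.array([1.0, 0.0, 0.0, 2.0])
print(aggregate_costs(row))  # equals the brute-force weighted sum over all j
```

A brute-force evaluation of the same weighted sum would cost O(N·W) for support width W; the two-pass recursion removes the dependence on W entirely, which is the property the abstract highlights.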

  • The Evolution of First Person Vision Methods: A Survey

    Publication Year: 2015 , Page(s): 744 - 760
    PDF (2754 KB) | HTML

    The emergence of new wearable technologies, such as action cameras and smart glasses, has increased the interest of computer vision scientists in the first person perspective. Nowadays, this field is attracting the attention and investment of companies aiming to develop commercial devices with first person vision (FPV) recording capabilities. Due to this interest, an increasing demand for methods to process these videos, possibly in real time, is expected. Current approaches present particular combinations of different image features and quantitative methods to accomplish specific objectives such as object detection, activity recognition, and user–machine interaction. This paper summarizes the evolution of the state of the art in FPV video analysis between 1997 and 2014, highlighting, among others, the most commonly used features, methods, challenges, and opportunities within the field.

  • Visual Target TRACTOR: Tracker and Detector

    Publication Year: 2015 , Page(s): 761 - 775
    PDF (6020 KB) | HTML

    In this paper, we focus on developing a novel visual target tracking system (TRACTOR) that enables reliable operation in realistic tracking scenarios. Although much progress has been made in the field of visual target tracking, there are still challenging scenarios in which even state-of-the-art trackers do not operate reliably. For instance, most trackers are prone to drift if a target moves abruptly in unexpected directions, or reappears after being fully occluded by clutter or after disappearing from the field of view of a camera. To cope with these scenarios effectively, the proposed tracking system subdivides the task of visual target tracking into two subtasks, i.e., tracking and detection. 1) For target-visible frames, the tracker builds a collaborative framework with the proposed appearance, observation, and motion models, thereby achieving robust performance against unexpected motions and appearance changes of a target, and 2) for target-invisible frames, the detector continuously verifies whether a target candidate is the lost target or clutter, thereby effectively reducing false target alarms. In extensive experiments, the proposed tracking system shows very promising performance in comparison with state-of-the-art tracking methods.

  • Tracker-Level Fusion for Robust Bayesian Visual Tracking

    Publication Year: 2015 , Page(s): 776 - 789
    PDF (4912 KB) | HTML

    We propose a tracker-level fusion framework for robust visual tracking. The framework combines trackers addressing different tracking challenges to improve the overall performance. A novelty of the proposed framework is the inclusion of an online performance measure to identify the track-quality level of each tracker so as to guide the fusion. The fusion is then based on appropriately mixing the prior state of the trackers. Moreover, the track-quality level is used to update the target appearance model. We demonstrate the framework with two Bayesian trackers on video sequences with various challenges and show its robustness compared with the independent use of the two individual trackers, and also compared with state-of-the-art trackers that use tracker-level fusion.

  • Objective Performance Evaluation of the HEVC Main Still Picture Profile

    Publication Year: 2015 , Page(s): 790 - 797
    PDF (3224 KB) | HTML

    The first version of the High Efficiency Video Coding (HEVC) standard was approved by both ITU-T and ISO/IEC in 2013 and includes three profiles: Main and Main 10 for typical video data with 8 and 10 bits, respectively, as well as a profile referred to as the Main Still Picture (MSP) profile. The MSP profile extends the HEVC application space toward still images, which, in turn, raises the question of how this HEVC profile performs relative to existing still image coding technologies. This paper addresses this question from a coding-efficiency point of view by presenting a rate-distortion performance analysis of the HEVC MSP profile in comparison to other popular still image and video compression schemes, including JPEG, JPEG 2000, JPEG XR, H.264/MPEG-4 AVC, VP8, VP9, and WebP. In summary, the HEVC MSP profile provides average bit-rate savings in the range from 10% to 44% relative to the whole set of competing video and still image compression schemes when averaged over a representative test set of photographic still images. Compared with Baseline JPEG alone, the average bit-rate saving for the HEVC MSP profile is 44%.

  • Frequency-Domain Intra Prediction Analysis and Processing for High-Quality Video Coding

    Publication Year: 2015 , Page(s): 798 - 811
    PDF (2715 KB) | HTML

    Most of the advances in video coding technology focus on applications that require low bitrates, for example, for content distribution on a mass scale. For these applications, the performance of conventional coding methods is typically sufficient. Such schemes inevitably introduce large losses to the signal, which are unacceptable for numerous other professional applications such as capture, production, and archiving. To boost the performance of video codecs for high-quality content, better techniques are needed, especially in the context of the prediction module. An analysis of conventional intra prediction methods used in the state-of-the-art High Efficiency Video Coding (HEVC) standard is reported in this paper, in terms of the prediction performance of such methods in the frequency domain. Appropriately modified encoder and decoder schemes are presented and used for this paper. The analysis shows that conventional intra prediction methods can be improved, especially for high-frequency components of the signal, which are typically difficult to predict. A novel approach to improve the efficiency of high-quality video coding is also presented in this paper based on such analysis. The modified encoder scheme allows for an additional stage of processing performed on the transformed prediction to replace selected frequency components of the signal with specifically defined synthetic content. The content is introduced in the signal using feature-dependent lookup tables. The approach is shown to achieve consistent gains over conventional HEVC, with bitrate savings of up to 5.2%.

  • A QoE-Based Link Adaptation Scheme for H.264/SVC Video Multicast Over IEEE 802.11

    Publication Year: 2015 , Page(s): 812 - 826
    PDF (3688 KB) | HTML

    Scalable Video Coding (SVC) is an extension of H.264/Advanced Video Coding (AVC), which has good characteristics for video transmission over networks. SVC encodes a video into a base layer (BL) and multiple enhancement layers (ELs). Based on the degree of importance of each layer, in addition to different modulation and coding schemes (MCSs), we can assign different retry attempt limits to each layer to improve quality of experience (QoE) metrics, such as average playback bitrate and buffering ratio. For example, we can assign a slower MCS and a higher retry limit to the BL, to reduce the loss rate and maintain playback smoothness; at the same time, we can assign faster MCSs and lower retry limits to each of the ELs, to reduce the buffering ratio. In this paper, we present a QoE-based link adaptation (QLA) scheme for H.264/SVC video streaming over IEEE 802.11b/g wireless LANs. We then present a multicast extension of the QLA scheme, multicast QLA (MQLA). We implemented our QLA and MQLA schemes on a Linux-based Wi-Fi protocol driver, mac80211, and built a testbed to conduct our experiments. Experimental results show that both our schemes exhibit improved QoE performance over the default link adaptation scheme provided by the Linux wireless driver, minstrel.
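The per-layer assignment described above (robust settings for the base layer, aggressive settings for enhancement layers) can be sketched as a simple policy table. All names and numeric values here are illustrative assumptions, not figures from the paper:

```python
from dataclasses import dataclass

@dataclass
class LayerPolicy:
    mcs_mbps: float   # PHY rate of the assigned modulation and coding scheme
    retry_limit: int  # MAC-layer retransmission attempt limit

# Illustrative policy in the spirit of the scheme: the base layer (BL) gets a
# slower, more robust MCS and more retries so it almost always arrives; the
# enhancement layers (EL1, EL2) get faster MCSs and fewer retries so they do
# not delay playback when channel conditions degrade.
policy = {
    "BL":  LayerPolicy(mcs_mbps=6.0,  retry_limit=7),
    "EL1": LayerPolicy(mcs_mbps=24.0, retry_limit=2),
    "EL2": LayerPolicy(mcs_mbps=54.0, retry_limit=1),
}

def settings_for(layer):
    """Look up the transmission settings for an SVC layer."""
    return policy[layer]

print(settings_for("BL").retry_limit)  # 7
```

The trade-off is the one the abstract describes: losing an EL frame only lowers quality momentarily, while losing a BL frame stalls playback, so reliability budget is spent where a loss hurts most.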

  • View Synthesis Distortion Estimation With a Graphical Model and Recursive Calculation of Probability Distribution

    Publication Year: 2015 , Page(s): 827 - 840
    PDF (2162 KB) | HTML

    Depth-image-based rendering (DIBR) is frequently used in multiview video applications such as free-viewpoint television. In this paper, we consider the two DIBR algorithms used in the Moving Picture Experts Group view synthesis reference software, and develop a scheme for the encoder to estimate the distortion of the synthesized virtual view at the decoder when the reference texture and depth sequences experience transmission errors such as packet loss. We first develop a graphical model to analyze how random errors in the reference depth image affect the synthesized virtual view. The warping competition rule adopted in the DIBR algorithms is explicitly represented by the graphical model. We then consider the case where packet loss occurs to both the encoded texture and depth images during transmission and develop a recursive optimal distribution estimation (RODE) method to calculate the per-pixel texture and depth probability distributions in each frame of the reference views. The RODE is then integrated with the graphical model method to estimate the distortion in the synthesized view caused by packet loss. Experimental results verify the accuracy of the graphical model method, the RODE, and the combined estimation scheme.

  • SIMD Acceleration for HEVC Decoding

    Publication Year: 2015 , Page(s): 841 - 855
    PDF (4186 KB) | HTML

    Single instruction multiple data (SIMD) instructions have been commonly used to accelerate video codecs. The recently introduced High Efficiency Video Coding (HEVC) codec, like its predecessors, is based on the hybrid video codec principle and, therefore, is also well suited to acceleration with SIMD. In this paper, we present SIMD optimizations for the entire HEVC decoder for all major SIMD instruction set architectures. Evaluation has been performed on 14 mobile and PC platforms covering most major architectures released in recent years. With SIMD, speedups of up to 5× can be achieved over the entire HEVC decoder, resulting in up to 133 and 37.8 frames/s on average on a single core for Main profile 1080p and Main10 profile 2160p sequences, respectively.

  • A Deeply Pipelined CABAC Decoder for HEVC Supporting Level 6.2 High-Tier Applications

    Publication Year: 2015 , Page(s): 856 - 868
    PDF (4534 KB) | HTML

    High Efficiency Video Coding (HEVC) is the latest video coding standard that specifies video resolutions up to 8K ultra-high definition (UHD) at 120 frames/s to support the next decade of video applications. This results in high-throughput requirements for the context-adaptive binary arithmetic coding (CABAC) entropy decoder, which was already a well-known bottleneck in H.264/AVC. To address the throughput challenges, several modifications were made to CABAC during the standardization of HEVC. This paper leverages these improvements in the design of a high-throughput HEVC CABAC decoder. It also supports the high-level parallel processing tools introduced by HEVC, including tile and wavefront parallel processing. The proposed design uses a deeply pipelined architecture to achieve a high clock rate. Additional techniques such as state prefetch logic, a latch-based context memory, and separate finite state machines are applied to minimize stall cycles, while multibypass-bin decoding is used to further increase throughput. The design is implemented in an International Business Machines 45-nm silicon-on-insulator process. After place and route, its operating frequency reaches 1.6 GHz. The corresponding throughputs reach up to 1696 and 2314 Mbin/s under common and theoretical worst-case test conditions, respectively. The results show that the design is sufficient to decode, in real time, high-tier video bitstreams at level 6.2 (8K UHD at 120 frames/s), or main-tier bitstreams at level 5.1 (4K UHD at 60 frames/s) for applications requiring subframe latency, such as video conferencing.

  • Clothing Attributes Assisted Person Reidentification

    Publication Year: 2015 , Page(s): 869 - 878
    PDF (3711 KB) | HTML

    Person reidentification across nonoverlapping camera views is a rather challenging task. Due to the difficulties in obtaining identifiable faces, clothing appearance becomes the main cue for identification purposes. In this paper, we present a comprehensive study on clothing-attributes-assisted person reidentification. First, the body parts and their local features are extracted to alleviate the pose-misalignment issue. A latent support vector machine (LSVM)-based person reidentification approach is proposed to describe the relations among the low-level part features, middle-level clothing attributes, and high-level reidentification labels of person pairs. Motivated by the uncertainties of clothing attributes, we treat them as real-value variables instead of discrete variables. Moreover, a large-scale real-world dataset with 10 camera views and about 200 subjects is collected and thoroughly annotated for this paper. The extensive experiments on this dataset show: 1) part features are more effective than features extracted from the holistic human bounding boxes; 2) the clothing attributes embedded in the LSVM model may further boost reidentification performance compared with a support vector machine without clothing attributes; and 3) treating clothing attributes as real-value variables is more effective than using them as discrete variables in person reidentification.

  • Motion-Resistant Remote Imaging Photoplethysmography Based on the Optical Properties of Skin

    Publication Year: 2015 , Page(s): 879 - 891
    PDF (5200 KB) | HTML

    Remote imaging photoplethysmography (RIPPG) can achieve contactless monitoring of human vital signs. However, robustness to a subject’s motion is a challenging problem for RIPPG, especially in facial video-based RIPPG. The RIPPG signal originates from the radiant intensity variation of human skin with pulses of blood, and motions can modulate the radiant intensity of the skin. Based on the optical properties of human skin, we build an optical RIPPG signal model in which the origins of the RIPPG signal and motion artifacts can be clearly described. The region of interest (ROI) of the skin is regarded as a Lambertian radiator, and the effect of ROI tracking is analyzed from the perspective of radiometry. By considering a digital color camera as a simple spectrometer, we propose an adaptive color difference operation between the green and red channels to reduce motion artifacts. Based on the spectral characteristics of photoplethysmography signals, we propose an adaptive bandpass filter to remove residual motion artifacts of RIPPG. We also combine ROI selection on the subject’s cheeks with speeded-up robust features point tracking to improve the RIPPG signal quality. Experimental results show that the proposed RIPPG obtains greatly improved performance in estimating heart rates for moving subjects, compared with state-of-the-art facial video-based RIPPG methods.
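The two signal-processing steps named in the abstract (a green-minus-red channel difference to cancel motion, then a bandpass filter over the heart-rate band) can be sketched on synthetic data. This is a simplified illustration: the least-squares choice of the scaling factor `alpha` is an assumption for demonstration, whereas the paper derives its adaptive operation from the optical properties of skin:

```python
import numpy as np

def color_difference_signal(green, red):
    """Green-minus-scaled-red difference: alpha scales the red trace so that
    the shared (motion-dominated) intensity variation cancels. Here alpha is
    fit by least squares, an assumption made for this sketch."""
    g = green - green.mean()
    r = red - red.mean()
    alpha = np.dot(g, r) / np.dot(r, r)
    return g - alpha * r

def bandpass(signal, fs, lo=0.7, hi=4.0):
    """Keep only frequencies in a typical human heart-rate band (~42-240 bpm)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(spectrum, n=len(signal))

# Synthetic example: a weak 1.2 Hz pulse in the green channel, buried in a
# 0.3 Hz motion artifact shared by both channels (30 frames/s, 10 s clip).
fs = 30.0
t = np.arange(0, 10, 1 / fs)
motion = np.sin(2 * np.pi * 0.3 * t)
green = 0.1 * np.sin(2 * np.pi * 1.2 * t) + motion
red = motion.copy()

pulse = bandpass(color_difference_signal(green, red), fs)
peak_hz = np.fft.rfftfreq(len(pulse), 1 / fs)[np.argmax(np.abs(np.fft.rfft(pulse)))]
print(peak_hz)  # dominant frequency recovered near the 1.2 Hz pulse
```

The channel difference removes the artifact the two channels share, and the bandpass filter suppresses whatever motion residue falls outside the plausible pulse band.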

  • A Fast CU Size Decision Algorithm for the HEVC Intra Encoder

    Publication Year: 2015 , Page(s): 892 - 896
    PDF (915 KB) | HTML

    Intra coding plays a crucial role in the High Efficiency Video Coding (HEVC) standard. It provides higher coding efficiency than the previous standard, H.264/Advanced Video Coding. The block partitioning in HEVC supports a quad-tree-based coding unit (CU) structure with sizes from 64×64 to 8×8. The new structure improves coding performance on one hand, but increases coding complexity on the other. In this paper, a novel fast algorithm is proposed for the CU size decision in intra coding. Both the global and local edge complexities in the horizontal, vertical, 45° diagonal, and 135° diagonal directions are proposed and used to decide the partitioning of a CU. Coupled with handling its four sub-CUs in the same way, a CU is decided to be split, nonsplit, or undetermined at each depth. Compared with the reference software HM10.0, the encoding time is reduced by about 52% on average, with about 0.8% Bjontegaard delta rate increase and reasonable peak signal-to-noise ratio losses.
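The split/nonsplit/undetermined decision driven by directional edge complexity can be sketched as follows. The complexity measure and the thresholds here are simplified stand-ins chosen for illustration; they are not the paper's actual definitions:

```python
import numpy as np

def edge_complexities(block):
    """Mean absolute pixel differences along the four directions
    (horizontal, vertical, 45° diagonal, 135° diagonal)."""
    b = block.astype(float)
    horiz = np.abs(np.diff(b, axis=1)).mean()
    vert = np.abs(np.diff(b, axis=0)).mean()
    diag45 = np.abs(b[1:, :-1] - b[:-1, 1:]).mean()
    diag135 = np.abs(b[1:, 1:] - b[:-1, :-1]).mean()
    return horiz, vert, diag45, diag135

def split_decision(block, low=2.0, high=20.0):
    """Toy rule: smooth blocks are coded whole (nonsplit), highly textured
    blocks are split, and anything in between is left undetermined for the
    normal rate-distortion search. Thresholds are illustrative only."""
    complexity = max(edge_complexities(block))
    if complexity < low:
        return "nonsplit"
    if complexity > high:
        return "split"
    return "undetermined"

flat = np.full((64, 64), 100, dtype=np.uint8)          # smooth 64x64 CU
noisy = np.random.default_rng(1).integers(0, 256, (64, 64)).astype(np.uint8)
print(split_decision(flat), split_decision(noisy))  # nonsplit split
```

Skipping the exhaustive quad-tree search for clearly smooth or clearly textured CUs is what yields the reported encoding-time reduction at a small rate-distortion cost.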

  • IEEE Circuits and Systems Society Information

    Publication Year: 2015 , Page(s): C3
    PDF (110 KB)
    Freely Available from IEEE
  • Blank page

    Publication Year: 2015 , Page(s): C4
    PDF (3 KB)
    Freely Available from IEEE

Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it