
IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Issue 2 • June 2013

  • Table of Contents

    Publication Year: 2013 , Page(s): C1 - C4
  • IEEE Journal on Emerging and Selected Topics in Circuits and Systems publication information

    Publication Year: 2013 , Page(s): C2
  • Guest Editorial Computational and Smart Cameras

    Publication Year: 2013 , Page(s): 121 - 124
  • Automatic Fall Detection and Activity Classification by a Wearable Embedded Smart Camera

    Publication Year: 2013 , Page(s): 125 - 136
    Cited by:  Papers (1)

    Robust detection of events and activities, such as falling, sitting, and lying down, is key to a reliable elderly activity monitoring system. While fast and precise detection of falls is critical in providing immediate medical attention, other activities like sitting and lying down can provide valuable information for early diagnosis of potential health problems. In this paper, we present a fall detection and activity classification system using wearable cameras. Since the camera is worn by the subject, monitoring is not limited to confined areas and extends to wherever the subject may go, indoors or outdoors. Furthermore, since the captured images are not of the subject, privacy concerns are alleviated. We present a fall detection algorithm employing histograms of edge orientations and strengths, and propose an optical flow-based method for activity classification. The first set of experiments was performed with prerecorded video sequences from eight different subjects wearing a camera on their waist. Each subject performed around 40 trials, which included falling, sitting, and lying down. Moreover, an embedded smart camera implementation of the algorithm was tested on a CITRIC platform with subjects wearing the CITRIC camera, each performing 50 falls and 30 non-fall activities. Experimental results show the success of the proposed method.
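    A minimal sketch of the kind of edge-orientation/strength histogram such a fall detector relies on, assuming a grayscale frame as a NumPy array; the bin count and the finite-difference gradients are illustrative choices, not the authors' exact parameters.

      import numpy as np

      def edge_orientation_histogram(frame, n_bins=16):
          """Histogram of edge orientations weighted by edge strength (illustrative)."""
          f = frame.astype(np.float64)
          gy, gx = np.gradient(f)                  # simple finite-difference gradients
          strength = np.hypot(gx, gy)              # edge strength (gradient magnitude)
          orientation = np.arctan2(gy, gx)         # edge orientation in [-pi, pi]
          hist, _ = np.histogram(orientation, bins=n_bins, range=(-np.pi, np.pi),
                                 weights=strength)
          s = hist.sum()
          return hist / s if s > 0 else hist       # normalize so frames are comparable

      # A sudden, large change between consecutive histograms is one plausible cue for a fall.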

  • Hemispherical Multiple Camera System for High Resolution Omni-Directional Light Field Imaging

    Publication Year: 2013 , Page(s): 137 - 144
    Cited by:  Papers (2)

    Inspiration from light field imaging concepts has led to the construction of multi-aperture imaging systems. Using multiple cameras as individual apertures is a topic of high relevance in the light field imaging domain. The need for wide field-of-view (FOV), high-resolution video in areas such as surveillance, robotics, and automotive systems has driven the idea of omni-directional vision. Recently, the Panoptic camera concept was presented, which mimics the eyes of flying insects using multiple imagers. The Panoptic camera utilizes a novel methodology for constructing a spherically arranged, wide-FOV plenoptic imaging system, although its omni-directional image quality is limited by low-resolution sensors. In this paper, a very high-resolution light field imaging and recording system inspired by the Panoptic approach is presented. Major challenges, namely managing the huge amount of data and maintaining a scalable system, are addressed. The proposed system is capable of recording omni-directional video at 30 fps with a resolution exceeding 9000 × 2400 pixels. The system captures the surrounding light field over an omni-directional FOV. This important feature opens the door to various post-processing techniques, such as quality-enhanced 3D cinematography, very high-resolution depth map estimation, and high dynamic-range applications, which go beyond standard stitching and panorama generation.
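    To make the data-management challenge mentioned above concrete, a back-of-the-envelope estimate of the raw video bandwidth at the stated resolution and frame rate; the 8-bit RGB pixel format is an assumption, not a figure from the paper.

      width, height, fps = 9000, 2400, 30        # figures quoted in the abstract
      bytes_per_pixel = 3                        # assumption: 8-bit RGB, no subsampling

      raw_rate = width * height * bytes_per_pixel * fps   # bytes per second
      print(f"raw video rate ~ {raw_rate / 1e9:.2f} GB/s")  # ~1.94 GB/s before any compression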

  • Feature Extraction and Representation for Distributed Multi-View Human Action Recognition

    Publication Year: 2013 , Page(s): 145 - 154
    Cited by:  Papers (1)

    Multi-view human action recognition has gained a lot of attention in recent years for its superior performance compared to single-view recognition. In this paper, we propose a new framework for the real-time realization of human action recognition in distributed camera networks (DCNs). We first present a new feature descriptor (Mltp-hist) that is tolerant to illumination change, robust in homogeneous regions, and computationally efficient. Taking advantage of the proposed Mltp-hist, the noninformative 3-D patches generated from the background can be removed automatically, which effectively highlights the foreground patches. Next, a new feature representation method based on sparse coding is presented to generate the histogram representation of local videos to be transmitted to the base station for classification. Due to the sparse representation of extracted features, the approximation error is reduced. Finally, at the base station, a probability model is produced to fuse the information from various views and a class label is assigned accordingly. Compared to existing algorithms, the proposed framework has three advantages while requiring less memory and bandwidth: 1) no preprocessing is required; 2) communication among cameras is unnecessary; and 3) positions and orientations of cameras do not need to be fixed. We further evaluate the proposed framework on the most popular multi-view action dataset, IXMAS. Experimental results indicate that our framework repeatedly achieves state-of-the-art results when various numbers of views are tested. In addition, our approach is tolerant to various combinations of views and benefits from introducing more views at the testing stage. In particular, our results remain satisfactory even when large misalignment exists between the training and testing samples.
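    A rough sketch of the "encode locally, pool into a histogram, transmit" idea: local descriptors are coded against a learned dictionary and pooled into a per-video histogram. For brevity the sparse-coding step is replaced by hard nearest-atom assignment, and the dictionary, descriptor dimensionality, and pooling are illustrative assumptions, not the Mltp-hist specifics.

      import numpy as np

      def encode_and_pool(descriptors, dictionary):
          """descriptors: (n, d) local features; dictionary: (k, d) learned atoms.
          Returns a length-k histogram to be sent to the base station."""
          # Hard assignment: index of the closest dictionary atom for each descriptor
          # (a simplification of the sparse-coding step described in the paper).
          d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
          codes = d2.argmin(axis=1)
          hist = np.bincount(codes, minlength=dictionary.shape[0]).astype(np.float64)
          return hist / max(hist.sum(), 1.0)

      # Example with random data standing in for real local descriptors.
      rng = np.random.default_rng(0)
      hist = encode_and_pool(rng.normal(size=(500, 32)), rng.normal(size=(64, 32)))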

  • Dynamic Bayesian Network for Unconstrained Face Recognition in Surveillance Camera Networks

    Publication Year: 2013 , Page(s): 155 - 164
    Cited by:  Papers (2)

    The demand for robust face recognition in real-world surveillance cameras is increasing due to the needs of practical applications such as security and surveillance. Although face recognition has been studied extensively in the literature, achieving good performance in surveillance videos with unconstrained faces is inherently difficult. During the image acquisition process, noncooperative subjects appear in arbitrary poses and resolutions under different lighting conditions, together with image noise and blur. In addition, multiple cameras are usually distributed in a camera network and different cameras often capture a subject in different views. In this paper, we aim to tackle this unconstrained face recognition problem and utilize multiple cameras to improve the recognition accuracy using a probabilistic approach. We propose a dynamic Bayesian network to incorporate the information from different cameras as well as the temporal clues from frames in a video sequence. The proposed method is tested on a public surveillance video dataset with a three-camera setup. We compare our method to different benchmark classifiers with various feature descriptors. The results demonstrate that by modeling the face in a dynamic manner, the recognition performance in a multi-camera network improves over the other classifiers with various feature descriptors, and the recognition result is better than using any single camera alone.
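    The core of this kind of probabilistic fusion can be pictured as a recursive Bayesian update in which per-frame, per-camera likelihoods for each enrolled identity are multiplied into a running posterior. The sketch below uses plain log-likelihood accumulation with a static identity variable, which is a simplification of the full dynamic Bayesian network described in the paper.

      import numpy as np

      def fuse_identities(log_likelihoods, prior):
          """log_likelihoods: (frames, cameras, identities) array of log P(obs | id);
          prior: (identities,) prior over enrolled subjects. Returns the fused posterior."""
          log_post = np.log(prior)
          for t in range(log_likelihoods.shape[0]):
              for c in range(log_likelihoods.shape[1]):
                  log_post = log_post + log_likelihoods[t, c]   # independent-view assumption
              log_post -= log_post.max()                        # numerical stabilization
          post = np.exp(log_post)
          return post / post.sum()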

  • Action-Based Multi-Camera Synchronization

    Publication Year: 2013 , Page(s): 165 - 174
    Cited by:  Papers (1)

    We propose a video alignment method based on observing the actions of a set of articulated objects. Given object association information, the proposed video synchronization method is applicable to general and unconstrained scenarios in a way that is not feasible with current state-of-the-art approaches: the proposed method does not impose constraints on the relative pose or motion of the cameras, on the structure of the time warping between the videos and on the amount of overlap among the fields-of-view. The proposed method uses a high-level video analysis (object actions) and models the alignment as a frame association problem (as opposed to the traditional continuous time warping). We present a qualitative and quantitative analysis of the results in real-world complex scenarios, showing the robustness of the method and higher accuracy compared to the only approach from the literature that works under similar conditions.

  • Geometry-Based Object Association and Consistent Labeling in Multi-Camera Surveillance

    Publication Year: 2013 , Page(s): 175 - 184

    This paper proposes a multi-camera surveillance framework based on multiple view geometry. We address the problem of object association and consistent labeling by exploring geometrical correspondences of objects, not only in sequential frames from a single camera view but also across multiple camera views. The cameras are geometrically related through a joint combination of multi-camera calibration, a ground plane homography constraint, and field-of-view lines. Object detection is implemented using an adaptive Gaussian mixture model, and the information obtained from different cameras is then fused so that the same object shown in different views can be assigned a unique label. Meanwhile, a virtual top view of the ground plane is synthesized to explicitly display the corresponding location and label of each detected object within a designated area-of-interest.
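    A minimal sketch of associating detections across two views through a ground-plane homography, assuming each object is represented by its foot point in image coordinates and that the 3x3 homographies to a common ground plane are already known from calibration; the distance threshold is an illustrative parameter, not the paper's value.

      import numpy as np

      def to_ground(points, H):
          """Map (n, 2) image foot points to the common ground plane via homography H."""
          pts = np.hstack([points, np.ones((len(points), 1))])
          mapped = (H @ pts.T).T
          return mapped[:, :2] / mapped[:, 2:3]

      def associate(points_a, H_a, points_b, H_b, max_dist=0.5):
          """Return (i, j) index pairs whose ground-plane positions agree."""
          ga, gb = to_ground(points_a, H_a), to_ground(points_b, H_b)
          pairs = []
          for i, p in enumerate(ga):
              d = np.linalg.norm(gb - p, axis=1)
              j = int(d.argmin())
              if d[j] < max_dist:
                  pairs.append((i, j))       # same label assigned to both detections
          return pairs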

  • Multi-View ML Object Tracking With Online Learning on Riemannian Manifolds by Combining Geometric Constraints

    Publication Year: 2013 , Page(s): 185 - 197

    This paper addresses issues in object tracking with occlusion scenarios, where multiple uncalibrated cameras with overlapping fields of view are exploited. We propose a novel method where tracking is first done independently in each individual view and the tracking results are then mapped between views to improve the tracking jointly. The proposed tracker uses the assumptions that objects are visible in at least one view and move upright on a common planar ground that may induce a homography relation between views. A method for online learning of object appearances on Riemannian manifolds is also introduced. The main novelties of the paper include: 1) defining a similarity measure, based on geodesics between a candidate object and a set of mapped references from multiple views on a Riemannian manifold; 2) proposing multi-view maximum likelihood estimation of object bounding box parameters, based on Gaussian-distributed geodesics on the manifold; 3) introducing online learning of object appearances on the manifold, taking into account possible occlusions; 4) utilizing projective transformations of objects between views, where parameters are estimated from the warped vertical axis by combining planar homography, epipolar geometry, and the vertical vanishing point; 5) embedding single-view trackers in a three-layer multi-view tracking scheme. Experiments have been conducted on videos from multiple uncalibrated cameras, where objects contain long-term partial/full occlusions or frequent intersections. Comparisons have been made with three existing methods, where the performance is evaluated both qualitatively and quantitatively. Results show the effectiveness of the proposed method in terms of robustness against tracking drift caused by occlusions.
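    The geodesic-based similarity can be illustrated with the affine-invariant distance between symmetric positive-definite appearance descriptors (e.g., region covariance matrices), a common choice for appearance modeling on Riemannian manifolds; whether this is the exact metric used in the paper is an assumption, and the Gaussian bandwidth below is a placeholder.

      import numpy as np
      from scipy.linalg import eigh

      def spd_geodesic_distance(X, Y):
          """Affine-invariant geodesic distance between SPD matrices X and Y:
          sqrt(sum_i log^2 lambda_i), with lambda_i the generalized eigenvalues of (X, Y)."""
          lam = eigh(X, Y, eigvals_only=True)    # all positive for SPD inputs
          return float(np.sqrt(np.sum(np.log(lam) ** 2)))

      def similarity(candidate, mapped_references, sigma=1.0):
          """Gaussian similarity between a candidate and references mapped from other views."""
          d = np.array([spd_geodesic_distance(candidate, R) for R in mapped_references])
          return np.exp(-d ** 2 / (2.0 * sigma ** 2))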

  • Implementation of Wireless Vision Sensor Node With a Lightweight Bi-Level Video Coding

    Publication Year: 2013 , Page(s): 198 - 209

    Wireless vision sensor networks (WVSNs) consist of a number of wireless vision sensor nodes (VSNs) which have limited resources, i.e., energy, memory, processing, and wireless bandwidth. The processing and communication energy requirements of an individual VSN are a challenge because of the limited energy available. To meet this challenge, we have proposed and implemented a programmable and energy-efficient VSN architecture which has lower energy requirements and a reduced design complexity. In the proposed system, vision tasks are partitioned between the hardware-implemented VSN and a server. The initial data-dominated tasks are implemented on the VSN, while the control-dominated complex tasks are processed on a server. This strategy reduces both the processing energy consumption and the design complexity. The communication energy consumption is reduced by implementing a lightweight bi-level video coding on the VSN. The energy consumption is measured on real hardware for different applications, and the proposed VSN is compared against published systems. The results show that, depending on the application, the energy consumption can be reduced by a factor of approximately 1.5 up to 376 compared to a VSN without the bi-level video coding. The proposed VSN offers an energy-efficient, generic architecture with smaller design complexity on a hardware-reconfigurable platform and is easily adapted to a number of applications compared to published systems.
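    A toy illustration of the bi-level coding idea: each frame is reduced to a binary (bi-level) image and run-length encoded before transmission, which is what makes the payload small. The threshold and run-length scheme here are illustrative; the paper's actual coder is a more elaborate hardware implementation.

      import numpy as np

      def bilevel_encode(frame, threshold=128):
          """Binarize a grayscale frame and run-length encode the flattened bit stream."""
          bits = (frame.ravel() >= threshold).astype(np.uint8)
          change = np.flatnonzero(np.diff(bits)) + 1        # positions where the bit value flips
          boundaries = np.concatenate(([0], change, [bits.size]))
          runs = np.diff(boundaries)                        # lengths of constant-value runs
          return int(bits[0]), runs                         # first run value + run lengths

      rng = np.random.default_rng(1)
      first_value, runs = bilevel_encode(rng.integers(0, 256, (240, 320), dtype=np.uint8))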

  • P-FAD: Real-Time Face Detection Scheme on Embedded Smart Cameras

    Publication Year: 2013 , Page(s): 210 - 222
    Cited by:  Papers (2)

    Face detection on general embedded devices is fundamentally different from the conventional approach on a personal computer or consumer digital camera due to the limited computation and power capacity. This resource-limited characteristic gives rise to new challenges for implementing a real-time video surveillance system with smart cameras. In this work, we present the design and implementation of Pyramid-like FAce Detection (P-FAD), a real-time face detection system constructed on general embedded devices. Motivated by the observation that the computation overhead increases proportionally to the amount of pixel manipulation, P-FAD adopts a hierarchical approach that shifts the complex computation to the promising regions. More specifically, P-FAD presents a three-stage coarse, shift, and refine procedure to construct a pyramid-like detection framework that reduces the computation overhead significantly. This framework also strikes a balance between detection speed and accuracy. We have implemented P-FAD on a notebook, an Android phone, and our embedded smart camera platform. An extensive system evaluation in terms of detailed experimental and simulation results is provided. Our empirical evaluation shows that P-FAD outperforms the V-J detector combined with a calibrated color detector (VJ-CD) and the color detector followed by a V-J detector (CD-VJ), the state-of-the-art real-time face detection techniques, by 4.7-8.6 times on the notebook and by up to 8.2 times on the smart phone in terms of detection speed.
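    The pyramid-like idea can be caricatured as: run a cheap test on a downscaled frame, keep only the promising regions, and spend the expensive detector there. The sketch below uses a crude skin-tone threshold as the cheap stage and leaves the refine stage as a pluggable callable; all thresholds and the scale factor are illustrative assumptions, not P-FAD's actual stages.

      import numpy as np

      def coarse_regions(frame_rgb, scale=4, block=16):
          """Cheap coarse stage: mark blocks of a downscaled frame that look skin-like."""
          small = frame_rgb[::scale, ::scale].astype(np.int32)
          r, g, b = small[..., 0], small[..., 1], small[..., 2]
          skin = (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b)   # crude skin rule (assumption)
          regions = []
          for y in range(0, skin.shape[0], block):
              for x in range(0, skin.shape[1], block):
                  if skin[y:y + block, x:x + block].mean() > 0.3:
                      # Map the block back to full-resolution coordinates for the refine stage.
                      regions.append((y * scale, x * scale, block * scale, block * scale))
          return regions

      def pyramid_detect(frame_rgb, refine):
          """refine(frame, (y, x, h, w)) -> list of face boxes; only promising regions are refined."""
          return [box for r in coarse_regions(frame_rgb) for box in refine(frame_rgb, r)]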

  • Multimodal Video Analysis on Self-Powered Resource-Limited Wireless Smart Camera

    Publication Year: 2013 , Page(s): 223 - 235
    Cited by:  Papers (5)

    Surveillance is one of the most promising applications for wireless sensor networks, stimulated by a confluence of simultaneous advances in key disciplines: computer vision, image sensors, embedded computing, energy harvesting, and sensor networks. However, computer vision typically requires notable computing performance, a considerable memory footprint, and high power consumption. Thus, wireless smart cameras pose a challenge to current hardware capabilities in terms of low power consumption and high imaging performance. For this reason, wireless surveillance systems still require a considerable amount of research in areas such as mote architectures, video processing algorithms, power management, energy harvesting, and distributed engines. In this paper, we introduce a multimodal wireless smart camera equipped with a pyroelectric infrared sensor and a solar energy harvester. The aim of this work is to achieve the following goals: 1) combine local processing, low-power hardware design, power management, and energy harvesting to develop a low-power, low-cost, power-aware, and self-sustainable wireless video sensor node for on-board video processing; 2) develop an energy-efficient smart camera with highly accurate abandoned/removed object detection. The efficiency of our approach is demonstrated by experimental results in terms of power consumption and video processing accuracy as well as self-sustainability. Finally, simulation results show how perpetual operation can be achieved in an outdoor scenario within a typical video surveillance application dealing with abandoned/removed object detection.

  • Characterizing a Heterogeneous System for Person Detection in Video Using Histograms of Oriented Gradients: Power Versus Speed Versus Accuracy

    Publication Year: 2013 , Page(s): 236 - 247
    Cited by:  Papers (1)

    This paper presents a new implementation, with complete analysis, of the processing operations required by a widely used pedestrian detection algorithm (the histogram of oriented gradients (HOG) detector) when run in various configurations on a heterogeneous platform suitable for use as an embedded system. The platform consists of a field-programmable gate array (FPGA), a graphics processing unit (GPU), and a central processing unit (CPU), and we detail the advantages of such an image processing system for real-time performance. We thoroughly analyze the resulting tradeoffs between power consumption, latency, and accuracy for each possible configuration. We thus demonstrate that each of these factors can be prioritized by selecting a specific configuration. These configurations may then be changed dynamically to respond to the changing priorities of a real-time system, e.g., on a moving vehicle. We compare the performance of real-time implementations of linear and kernel support vector machines in HOG and evaluate the entire system against the state of the art in real-time person detection. We also show that our FPGA implementation detects pedestrians more accurately than existing implementations, and that a heterogeneous configuration which performs image scaling on the GPU and histogram extraction and classification on the FPGA produces a good compromise between power and speed.
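    For reference, a software baseline of the HOG pedestrian detector is available in OpenCV; the snippet below shows the standard CPU usage with the pretrained linear SVM, not the paper's FPGA/GPU implementation, and the image path is a placeholder.

      import cv2

      hog = cv2.HOGDescriptor()                                        # default 64x128 person window
      hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector()) # pretrained linear SVM

      img = cv2.imread("frame.png")                                    # placeholder path
      rects, weights = hog.detectMultiScale(img, winStride=(8, 8), scale=1.05)
      for (x, y, w, h) in rects:
          cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)   # draw detections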

  • On the Hardware/Software Design and Implementation of a High Definition Multiview Video Surveillance System

    Publication Year: 2013 , Page(s): 248 - 262

    This paper proposes a distributed architecture for a high-definition multiview video surveillance system. It adopts a modular design where single-view/stereo intelligent internet protocol (IP)-based video surveillance cameras are connected to front-end field-programmable gate array (FPGA) board(s), which are in turn connected to a back-end local video server through the IP network. The data-intensive video analytics (VA) algorithms, such as background modeling, connected component labeling, and single-view object tracking, are implemented in the FPGA using an efficient fixed-point architecture. Each back-end video server is equipped with storage and graphics processing units for supporting high-level VA and other processing algorithms such as video decompression/display, mean depth estimation, and consistent labeling. A real-time prototype system was constructed to illustrate the architecture and the VA algorithms involved. Satisfactory results were obtained for both a publicly available data set and real surveillance video data.
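    The data-intensive front-end stages named above (background modeling and connected component labeling) have direct software counterparts in OpenCV, shown here only to make the pipeline concrete; the paper implements fixed-point versions of these stages on the FPGA, and the parameters below are illustrative.

      import cv2

      bg = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=False)

      def foreground_blobs(frame, min_area=200):
          """Adaptive GMM background subtraction followed by connected component labeling."""
          mask = bg.apply(frame)                             # 0 = background, 255 = foreground
          n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
          # Skip label 0 (background); keep components large enough to be objects.
          return [(stats[i, cv2.CC_STAT_LEFT], stats[i, cv2.CC_STAT_TOP],
                   stats[i, cv2.CC_STAT_WIDTH], stats[i, cv2.CC_STAT_HEIGHT])
                  for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= min_area]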

  • Real-Time People Tracking in a Camera Network

    Publication Year: 2013 , Page(s): 263 - 271
    Cited by:  Papers (1)

    We present an approach to track several subjects in real time from video sequences acquired by multiple cameras. We address the key concerns of real-time performance and continuity of tracking across overlapping and nonoverlapping fields of view. Each human subject is represented by a parametric ellipsoid having a state vector that encodes its position, velocity, and height. We also encode visibility and persistence to tackle problems of distraction and short-period occlusion. To improve likelihood computation from different viewpoints, including the relocation of subjects after network blind spots, the colored and textured surface of each ellipsoid is learned progressively as the subject moves through the scene. This is combined with the information about subject position and velocity to perform camera handoff. For real-time performance, the boundary of the ellipsoid can be projected several hundred times per frame for comparison with the observation image. Further, our implementation employs a particle filter developed for parallel implementation on a graphics processing unit. We have evaluated our algorithm on standard data sets using metrics for multiple object tracking accuracy (MOTA) and processing speed, and show significant improvements in comparison with published work.
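    A compact, CPU-only caricature of one particle filter step for a single subject, using the state layout from the abstract (position, velocity, height); the motion noise, likelihood function, and resampling rule are placeholders, and the paper's version evaluates the likelihoods in parallel on a GPU.

      import numpy as np

      rng = np.random.default_rng(0)

      def particle_filter_step(particles, weights, likelihood, dt=1.0 / 25):
          """particles: (n, 5) rows [x, y, vx, vy, h]; likelihood(particles) -> (n,) scores."""
          # Predict: constant-velocity motion plus Gaussian process noise (assumed values).
          particles[:, 0:2] += particles[:, 2:4] * dt
          particles += rng.normal(scale=[0.05, 0.05, 0.1, 0.1, 0.01], size=particles.shape)
          # Update: weight each hypothesis by the image likelihood of its projected ellipsoid.
          weights = weights * likelihood(particles)
          weights /= weights.sum()
          # Resample when the effective sample size collapses.
          if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
              idx = rng.choice(len(weights), size=len(weights), p=weights)
              particles, weights = particles[idx], np.full(len(weights), 1.0 / len(weights))
          return particles, weights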

  • An Accurate Algorithm for the Identification of Fingertips Using an RGB-D Camera

    Publication Year: 2013 , Page(s): 272 - 283
    Cited by:  Papers (1)

    RGB-D cameras and depth sensors have made possible the development of countless applications in the field of human-computer interaction. Such applications, ranging from gaming to medical, have been made possible by the capability of these sensors to produce depth maps of the observed environment. In this context, aiming to establish a sound basis for future applications involving the movement and pose of hands, we propose a new approach to recognize fingertips and identify their position by means of the Microsoft Kinect technology. The experimental results exhibit a very good identification rate and an execution speed faster than the frame rate, with no meaningful latencies, thus allowing the use of the proposed system in real-time applications. Furthermore, the achieved identification accuracy confirms the capability of following even small movements of the hand and encourages subsequent implementations in more complex gesture recognition systems.
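    A simplified sketch of depth-based fingertip finding, assuming a Kinect-style depth map in millimeters: the hand is segmented as the nearest surface and fingertip candidates are taken from the convex hull of its contour. The depth band and geometric thresholds are illustrative, not the paper's calibrated values, and the candidates would still need pruning.

      import cv2
      import numpy as np

      def fingertip_candidates(depth_mm, band=120):
          """Segment the nearest surface (assumed to be the hand) and return convex-hull points."""
          valid = depth_mm[depth_mm > 0]
          if valid.size == 0:
              return []
          near = valid.min()
          mask = ((depth_mm > 0) & (depth_mm < near + band)).astype(np.uint8) * 255
          contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
          if not contours:
              return []
          hand = max(contours, key=cv2.contourArea)
          hull = cv2.convexHull(hand)                # fingertips lie on the convex hull
          return [tuple(p[0]) for p in hull]         # (x, y) pixel candidates to prune further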

  • Software Laboratory for Camera Networks Research

    Publication Year: 2013 , Page(s): 284 - 293

    We present a distributed virtual vision simulator capable of simulating large-scale camera networks. Our virtual vision simulator can simulate pedestrian traffic in different 3D environments. Simulated cameras deployed in these virtual environments generate synthetic video feeds that are fed into a vision processing pipeline supporting pedestrian detection and tracking. The visual analysis results are then used for subsequent processing, such as camera control, coordination, and handoff. Our virtual vision simulator is realized as a collection of modules that communicate with each other over the network. Consequently, we can deploy our simulator over a network of computers, allowing us to simulate much larger camera networks and much more complex scenes than is otherwise possible. Specifically, we show that our proposed virtual vision simulator can model a camera network, comprising more than one hundred active pan/tilt/zoom and passive wide field-of-view cameras, deployed on an upper floor of an office tower in downtown Toronto.

  • IEEE Journal on Emerging and Selected Topics in Circuits and Systems information for authors

    Publication Year: 2013 , Page(s): 294
  • Technology insight on demand on IEEE.tv

    Publication Year: 2013 , Page(s): 295
  • Together, we are advancing technology

    Publication Year: 2013 , Page(s): 296
  • IEEE Circuits and Systems Society Information

    Publication Year: 2013 , Page(s): C3

Aims & Scope

The IEEE Journal on Emerging and Selected Topics in Circuits and Systems publishes special issues covering the entire Field of Interest of the IEEE Circuits and Systems Society, with particular focus on emerging areas.

Meet Our Editors

Editor-in-Chief
Manuel Delgado-Restituto
Instituto Nacional de Microelectrónica de Sevilla
IMSE-CNM (CSIC/Universidad de Sevilla)