2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Date: 20-25 June 2011


Displaying Results 1 - 25 of 146
  • Textured 3D face recognition using biological vision-based facial representation and optimized weighted sum fusion

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (5)

    This paper proposes a novel biological vision-based facial description, namely Perceived Facial Images (PFIs), aiming to highlight intra-class and inter-class variations of both facial range and texture images for textured 3D face recognition. These generated PFIs simulate the response of complex neurons to gradient information within a certain neighborhood and possess the properties of being highly distinctive and robust to affine illumination and geometric transformation. Based on such an intermediate facial representation, SIFT-based matching is further carried out to calculate similarity scores between a given probe face and the gallery ones. Because the facial description generates a PFI for each quantized gradient orientation of range and texture faces, we then propose a score level fusion strategy which optimizes the weights using a genetic algorithm in a learning step. Evaluated on the entire FRGC v2.0 database, the rank-one recognition rate using only 3D or 2D modality is 95.5% and 95.9%, respectively; while fusing both modalities, i.e. range and texture-based PFIs, the final accuracy is 98.0%, demonstrating the effectiveness of the proposed biological vision-based facial description and the optimized weighted sum fusion.
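
    The weighted-sum fusion step is easy to picture in code. The sketch below is illustrative only: the array layout, label handling, and the random-search weight optimizer are hypothetical stand-ins for the paper's per-orientation SIFT scores and genetic-algorithm weight learning.

        import numpy as np

        def fuse_scores(score_maps, weights):
            # score_maps: (n_channels, n_probes, n_gallery) similarity scores,
            # one channel per quantized gradient orientation / modality.
            return np.tensordot(weights, score_maps, axes=1)

        def rank_one_rate(fused, gallery_labels, probe_labels):
            # A probe is correct if its best-scoring gallery entry shares its label.
            best = np.asarray(gallery_labels)[np.argmax(fused, axis=1)]
            return np.mean(best == np.asarray(probe_labels))

        def random_search_weights(score_maps, gallery_labels, probe_labels,
                                  iters=2000, seed=0):
            # Toy stand-in for the paper's genetic-algorithm weight learning:
            # sample weight vectors on the simplex (on a training split) and
            # keep the best-performing one.
            rng = np.random.default_rng(seed)
            n = score_maps.shape[0]
            best_w, best_acc = np.full(n, 1.0 / n), -1.0
            for _ in range(iters):
                w = rng.random(n)
                w /= w.sum()
                acc = rank_one_rate(fuse_scores(score_maps, w),
                                    gallery_labels, probe_labels)
                if acc > best_acc:
                    best_w, best_acc = w, acc
            return best_w, best_acc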

  • Analysis of patterns of motor behavior in gamers with Down syndrome

    Publication Year: 2011 , Page(s): 1 - 6

    This paper reports on the computer vision-based analysis of weight-shifting patterns in Down Syndrome subjects. The subjects were filmed while playing a snowboarding game in a virtual reality environment. To capture local changes in posture during motion performance, we introduce the concept of parabolic bounding box. This concept aims at capturing information about the lateral curvature of the human body in motion. This information is aggregated into histograms of motion curvatures. This novel motion representation is useful for differentiating between normal and abnormal weight shifting patterns, and for presenting subject specific information about the frequency of these patterns over the entire duration of the game.

  • Tracking through scattered occlusion

    Publication Year: 2011 , Page(s): 1 - 8

    Scattered occlusion is an occlusion that is not localized in space or time. It occurs because of heavy smoke, rain, snow and fog, as well as tree branches and leaves, or any other thick flora for that matter. As a result, we cannot assume that there is correlation in the visibility of nearby pixels. We propose a new tracker, dubbed Scatter Tracker, that can efficiently deal with this type of occlusion. Our tracker is based on a new similarity measure between images that combines order statistics with a spatial prior that forces the order statistics to work on non-overlapping patches. We analyze the probability of detection, and false detection, of our tracker and show that it can be modeled as a sequence of independent Bernoulli trials on pixel similarity. In addition, to handle appearance variations of the tracked target, an appearance model update scheme based on an incremental-PCA procedure is incorporated into the tracker. We show that the combination of order statistics and spatial prior greatly enhances the quality of our tracker and demonstrate its effectiveness on a number of challenging video sequences.
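
    A minimal reading of such a patch-wise order-statistic similarity, assuming absolute differences and the median as the per-patch statistic (both assumptions, not the paper's exact formulation):

        import numpy as np

        def scatter_similarity(template, candidate, patch=8, quantile=0.25):
            # template, candidate: same-size grayscale windows.
            # Split both windows into non-overlapping patches, score each patch
            # with a robust (order-statistic) distance, then keep only a low
            # quantile of the patch distances so patches hidden by scattered
            # occluders are ignored.
            H = (template.shape[0] // patch) * patch
            W = (template.shape[1] // patch) * patch
            dists = []
            for y in range(0, H, patch):
                for x in range(0, W, patch):
                    d = np.abs(template[y:y+patch, x:x+patch].astype(float)
                               - candidate[y:y+patch, x:x+patch].astype(float))
                    dists.append(np.median(d))        # per-patch order statistic
            dists = np.sort(np.array(dists))
            k = max(1, int(quantile * len(dists)))
            return -np.mean(dists[:k])                # higher is more similar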

  • Dense shape correspondences using spectral high-order graph matching

    Publication Year: 2011 , Page(s): 1 - 8

    This paper addresses the problem of establishing point correspondences between two object instances using spectral high-order graph matching. To this end, 3D objects are intrinsically represented by weighted high-order adjacency tensors. These are, depending on the weighting scheme, invariant to structure-preserving, equi-areal, conformal or volume-preserving object deformations. Higher-order spectral decomposition transforms the NP-hard assignment problem into a linear assignment problem by canonical embedding. This allows dense correspondence information to be extracted with reasonable computational complexity, making the method faster than any previously published method imposing higher-order constraints on shape matching. Robustness against missing data and resampling is measured and compared with a baseline spectral graph matching method.
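
    For orientation, the sketch below implements the classic second-order (pairwise) spectral matching baseline of the kind the paper compares against; the paper itself works with high-order adjacency tensors, which this toy code does not reproduce, and the dense O((nm)^2) affinity matrix limits it to small point sets.

        import numpy as np

        def spectral_match(D1, D2, sigma=0.1):
            # D1 (n x n), D2 (m x m): pairwise distances on the two shapes.
            # Affinity between assignments (i->a) and (j->b) is high when the
            # distances d(i, j) and d(a, b) agree.
            n, m = len(D1), len(D2)
            A = n * m
            d1 = D1[:, None, :, None]               # broadcast to (i, a, j, b)
            d2 = D2[None, :, None, :]
            M = np.exp(-((d1 - d2) ** 2) / sigma ** 2).reshape(A, A)
            # Principal eigenvector by power iteration.
            v = np.ones(A) / np.sqrt(A)
            for _ in range(100):
                v = M @ v
                v /= np.linalg.norm(v)
            # Greedy one-to-one discretization of the soft assignment.
            v = v.reshape(n, m).copy()
            matches = []
            while v.max() > 0:
                i, a = np.unravel_index(np.argmax(v), v.shape)
                matches.append((i, a))
                v[i, :] = 0
                v[:, a] = 0
            return matches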

  • Silhouette-based features for visible-infrared registration

    Publication Year: 2011 , Page(s): 68 - 73
    Cited by:  Papers (1)

    We study the registration problem for infrared-visible stereo pairs. Given the properties of infrared and visible images that make them mostly similar near boundaries, we propose a method to extract keypoints on the boundary and on the skeleton of a region of interest (ROI). We show that our keypoints may be applied to partial (ROI-based) and global image registration, for videos or for still images, given that the ROI silhouette is detected. In experiments, we show that our method gives better results than other classic keypoints, and results that are close to those of a state-of-the-art trajectory-based global registration method that uses temporal information.

  • Microscopic shape from focus with optimal illumination

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (1)

    We present a novel method for compensating illumination artifacts in shape from focus reconstruction that does not require additional measurement time. Frequently applied in optical microscopy, shape from focus requires rich surface texture over the whole scene. This prerequisite is violated in saturated image regions. To overcome this limitation, we automatically compensate for the scene reflectance by means of a projector-camera system on a per-image basis. We iteratively adapt illumination in the first image and consecutively track the compensation pattern through the image stack to reduce measurement time. Despite the low measurement time, our experiments show that our method outperforms the standard shape from focus approach and is also superior to competing methods like high dynamic range imaging in terms of measurement time and accuracy.
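
    The underlying shape-from-focus baseline (without the illumination compensation that is the paper's contribution) reduces to an argmax of a per-pixel focus measure over the stack; the focus measure and window size below are common choices, not taken from the paper.

        import numpy as np
        from scipy.ndimage import laplace, uniform_filter

        def shape_from_focus(stack, window=9):
            # stack: (n_slices, H, W) grayscale focus stack, slice index ~ depth.
            # Focus measure: locally averaged squared Laplacian response.
            measures = []
            for img in stack.astype(float):
                fm = uniform_filter(laplace(img) ** 2, size=window)
                measures.append(fm)
            measures = np.stack(measures)
            depth_index = np.argmax(measures, axis=0)   # best-focused slice
            confidence = np.max(measures, axis=0)       # low in textureless areas
            return depth_index, confidence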

  • Mesh-based global motion compensation for robust mosaicking and detection of moving objects in aerial surveillance

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (1)

    Global Motion Compensation is one of the key technologies for aerial image processing, e.g. to detect moving objects on the ground or to generate a mosaic image of the observed area. For this task, it is necessary to estimate and compensate the motion of the pixels between the recorded frames evoked by the movement of the camera. As the camera is statically attached to a flying device such as a quadrocopter (also called Micro Air Vehicle, MAV) or a helicopter, the motion of the camera directly corresponds to the movements of the aircraft. For simplicity, only a planar landscape model is commonly used to describe the global motion of the scene. However, if objects like buildings or mountains are close to the camera, i.e. the MAV is at a low altitude, this simplification is not valid. Therefore we propose a more complex model by introducing a 2D mesh-based motion compensation technique, also known as image warping, to compensate the global motion. We show the benefits for mosaic creation: fewer artifacts due to perspective distortions and reduced drift. We also improve a moving object detection system so that it identifies moving objects more reliably. Moreover, the proposed method is more robust to lens distortions.

  • A unified CSF-based framework for edge detection and edge visibility

    Publication Year: 2011 , Page(s): 21 - 26

    One important trend in edge detection starts from current knowledge about the Human Visual System (HVS) in order to mimic some of its components. We follow this trend and propose a new edge detector which also computes the edge visibility for the HVS. Two important processes in the HVS are taken into account: visual adaptation and contrast sensitivity. Our model is in good agreement with some classical results in human vision, such as Weber's law and Ricco's law, as well as the visibility of sine and square gratings. The main contribution is to propose a unified framework, biologically inspired, which mimics human vision and computes both edge localization and edge visibility. Compared to previous approaches, the visibility of a target is estimated without segmentation of the target. This work may contribute to military applications, Intelligent Transportation Systems, and low vision simulation.

  • A real-time system for 3D recovery of dynamic scene with multiple RGBD imagers

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (4)

    In this paper we present a real-time 3D recovery system using multiple FPGA-based RGBD imagers. The RGBD imager developed in our lab produces color images combined with corresponding dense disparity maps encoding 3D information. Multiple RGBD imagers are externally triggered to sense the 3D world synchronously. The acquired 3D information of a dynamic scene from multiple viewpoints is then streamed to a PC cluster for further processing. Multiple RGBD imagers are fully calibrated to normalize the 3D data into a uniform world coordinate system. Compared with most of the visual hull based real-time 3D reconstruction systems, our system relies on depth maps and is much more suited for large scale dynamic scenes with multiple moving objects (e.g. people). Examples are provided to demonstrate the effectiveness of our system.
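
    Normalizing each imager's depth data into a shared world coordinate system amounts to back-projection with the intrinsics plus a rigid transform from the extrinsic calibration; a generic sketch (not the authors' FPGA pipeline) follows.

        import numpy as np

        def depth_to_world(depth, K, R, t):
            # depth: (H, W) metric depth from one RGBD imager, K: 3x3 intrinsics,
            # (R, t): extrinsics mapping camera coordinates into the world frame.
            # Back-project every pixel and transform it, so point clouds from all
            # imagers land in one coordinate system.
            H, W = depth.shape
            u, v = np.meshgrid(np.arange(W), np.arange(H))
            pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW
            rays = np.linalg.inv(K) @ pix                                      # 3 x HW
            cam_pts = rays * depth.reshape(1, -1)                              # 3 x HW
            world_pts = R @ cam_pts + t.reshape(3, 1)
            return world_pts.T.reshape(H, W, 3)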

  • Advances in phonetics-based sub-unit modeling for transcription alignment and sign language recognition

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (9)

    We explore novel directions for incorporating phonetic transcriptions into sub-unit based statistical models for sign language recognition. First, we employ a new symbolic processing approach for converting sign language annotations, based on HamNoSys symbols, into structured sequences of labels according to the Posture-Detention-Transition-Steady Shift phonetic model. Next, we exploit these labels, and their correspondence with visual features to construct phonetics-based statistical sub-unit models. We also align these sequences, via the statistical sub-unit construction and decoding, to the visual data to extract time boundary information that they would lack otherwise. The resulting phonetic sub-units offer new perspectives for sign language analysis, phonetic modeling, and automatic recognition. We evaluate this approach via sign language recognition experiments on an extended Lemmas Corpus of Greek Sign Language, which results not only in improved performance compared to pure data-driven approaches, but also in meaningful phonetic sub-unit models that can be further exploited in interdisciplinary sign language analysis.

  • Automated segmentation of iris images using visible wavelength face images

    Publication Year: 2011 , Page(s): 9 - 14
    Cited by:  Papers (6)

    Remote human identification using iris biometrics requires the development of an automated algorithm for the robust segmentation of iris region pixels from visible face images. This paper presents a new automated iris segmentation framework for iris images acquired at-a-distance using visible imaging. The proposed approach achieves the segmentation of iris region pixels in two stages, i.e. (i) iris and sclera classification, and (ii) post-classification processing. Unlike the traditional edge-based segmentation approaches, the proposed approach simultaneously exploits discriminative color features and localized Zernike moments to perform pixel-based classification. Rigorous experimental results presented in this paper confirm the usefulness of the proposed approach, which achieves an improvement of 42.4% in the average segmentation error on the UBIRIS.v2 dataset as compared to the previous approach.
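
    Pixel-based classification of this kind can be prototyped quickly; the sketch below uses colour features only and a random forest, so both the classifier choice and the omission of the Zernike-moment features are simplifications of the paper's approach.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def train_pixel_classifier(images, label_masks):
            # images: list of (H, W, 3) RGB eye-region crops; label_masks: same-sized
            # integer masks (0 = background, 1 = sclera, 2 = iris).
            X = np.concatenate([im.reshape(-1, 3) for im in images])
            y = np.concatenate([m.reshape(-1) for m in label_masks])
            clf = RandomForestClassifier(n_estimators=50, n_jobs=-1)
            clf.fit(X, y)
            return clf

        def segment(clf, image):
            # Per-pixel prediction; post-classification clean-up would follow.
            H, W, _ = image.shape
            return clf.predict(image.reshape(-1, 3)).reshape(H, W)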

  • Wii Remote calibration using the sensor bar

    Publication Year: 2011 , Page(s): 7 - 12
    Cited by:  Papers (1)

    The Wii Remote is the standard controller of the Nintendo Wii® game console. Despite its low cost, it has a high-performance, high-resolution infrared camera, as well as a built-in chip for tracking up to 4 points in the infrared images. With these properties, the Wii Remote has attracted several attempts to calibrate it and use it for metric 3D information. Thanks to the Wii Remote's 4-point tracking hardware, previous work calibrated Wii Remotes using carefully designed square patterns with infrared LEDs at each corner. However, it is not easy for average Wii Remote users to implement such calibration tools. In this paper, we give an overview of the current Wii Remote calibration implementations and present a novel method to fully calibrate multiple Wii Remotes using only two points, such as the LEDs of the Sensor Bar that comes with the Wii console. For the 3D reconstruction, only two Wii Remotes and a PC are needed. We also demonstrate the usability of a full 3D tracker with a multiplayer game that receives inputs from up to 4 users with only two Wii Remotes.
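
    Once two Wii Remotes are calibrated, recovering metric 3D points from their tracked 2D points is standard linear triangulation; the helper below is a textbook DLT sketch, not the paper's two-point calibration procedure itself.

        import numpy as np

        def triangulate(P1, P2, x1, x2):
            # P1, P2: 3x4 projection matrices of two calibrated cameras
            # (e.g. two Wii Remotes); x1, x2: matching 2D points in pixels.
            # Solve A X = 0 for the homogeneous 3D point via SVD.
            A = np.array([
                x1[0] * P1[2] - P1[0],
                x1[1] * P1[2] - P1[1],
                x2[0] * P2[2] - P2[0],
                x2[1] * P2[2] - P2[1],
            ])
            _, _, Vt = np.linalg.svd(A)
            X = Vt[-1]
            return X[:3] / X[3]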

  • Temperature distribution descriptor for robust 3D shape retrieval

    Publication Year: 2011 , Page(s): 9 - 16
    Cited by:  Papers (2)

    Recent developments in acquisition techniques are resulting in a very rapid growth in the number of available three-dimensional (3D) models across areas as diverse as engineering, medicine and biology. It is therefore of great interest to develop efficient shape retrieval engines that, given a query object, return similar 3D objects. The performance of a shape retrieval engine is ultimately determined by the quality and characteristics of the shape descriptor used for shape representation. In this paper, we develop a novel shape descriptor, called the temperature distribution (TD) descriptor, which is capable of exploring the intrinsic geometric features of the shape. It interprets the shape in a way that is isometrically invariant, shape-aware, and insensitive to noise and small topological changes. The TD descriptor is driven by the heat kernel: it characterizes the shape by evaluating how the surface temperature distribution evolves over time after applying unit heat at each vertex. The descriptor is represented in the concise form of a one-dimensional (1D) histogram, and captures enough information to robustly handle the shape matching and retrieval process. Experimental results demonstrate the effectiveness of the TD descriptor in 3D shape matching and searching for models at different poses and various noise levels.
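
    A rough feel for heat-diffusion-based signatures can be had on a point-sampled surface; the toy code below (k-NN graph, explicit Euler diffusion, fixed histogram range — all assumptions of this sketch, not the paper's construction) mimics the idea of histogramming temperatures after injecting unit heat at a vertex.

        import numpy as np
        from scipy.sparse import csr_matrix
        from scipy.sparse.csgraph import laplacian
        from sklearn.neighbors import kneighbors_graph

        def temperature_histogram(points, source, t_steps=200, dt=1e-3, bins=32):
            # points: (N, 3) surface samples; source: index of the heated vertex.
            # Diffuse unit heat with explicit Euler steps of du/dt = -L u on a
            # k-NN graph, then histogram the resulting temperatures.
            W = kneighbors_graph(points, n_neighbors=8, mode='connectivity')
            W = csr_matrix(0.5 * (W + W.T))            # symmetrize the graph
            L = laplacian(W)
            u = np.zeros(len(points))
            u[source] = 1.0
            for _ in range(t_steps):
                u = u - dt * (L @ u)
            hist, _ = np.histogram(u, bins=bins, range=(0.0, 1.0))
            return hist / hist.sum()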

  • Where is the rat? Tracking in low contrast thermographic images

    Publication Year: 2011 , Page(s): 55 - 60

    This paper presents a method to track an animal in low-contrast thermographic images in order to obtain its body temperature. This work was done in the context of the study of atypical febrile seizures. To solve this tracking problem, we propose a method based on morphological operations on the area to track, using regions resulting from consecutive frame differences. A Gaussian model is then used to classify tracked area pixels into animal and background pixels to further remove outliers. The temperature of the animal is taken as the mean of the tracked area. Experimental results show that we obtain, in general, temperature estimates within 1°C of the ground truth for videos as long as 16000 frames.
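
    The pipeline reads naturally as frame differencing, morphology, and a Gaussian inlier test before averaging; the sketch below follows that outline with hypothetical thresholds and structuring choices rather than the paper's exact parameters.

        import numpy as np
        from scipy.ndimage import binary_closing, binary_dilation, label

        def track_and_measure(prev_frame, frame, diff_thresh=0.5):
            # prev_frame, frame: (H, W) thermographic images in degrees Celsius.
            moving = np.abs(frame.astype(float) - prev_frame.astype(float)) > diff_thresh
            moving = binary_closing(binary_dilation(moving, iterations=2), iterations=2)
            labels, n = label(moving)
            if n == 0:
                return None
            # Keep the largest connected component as the tracked area.
            sizes = np.bincount(labels.ravel())[1:]
            region = labels == (np.argmax(sizes) + 1)
            temps = frame[region].astype(float)
            # Gaussian model: keep pixels within 2 sigma before averaging.
            mu, sigma = temps.mean(), temps.std() + 1e-6
            inliers = temps[np.abs(temps - mu) < 2.0 * sigma]
            return inliers.mean()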

  • Projection defocus correction using adaptive kernel sampling and geometric correction in dual-planar environments

    Publication Year: 2011 , Page(s): 9 - 14

    Defocus blur correction for projectors using a camera is useful when the projector is used in ad hoc environments. However, past literature has not explicitly considered the common situation when the projection surface includes a corner made up of two planar surfaces that abut each other, such as the ubiquitous office cubicle. In this paper, we advance the state of the art by demonstrating defocus correction in a non-parametric setting. Our method differs from prior methods in that (a) the luminance and chrominance channels are independently considered, and (b) a sparse sampling of the surface is used to discover the spatially varying defocus kernel.

  • Low bit rate ROI based video coding for HDTV aerial surveillance video sequences

    Publication Year: 2011 , Page(s): 13 - 20
    Cited by:  Papers (4)

    Two features are key for aerial surveillance systems: first, they have to provide as much resolution as possible; second, they should make the video available at a ground station as soon as possible. Recently, so-called Unmanned Aerial Vehicles (UAVs) have come into focus for surveillance operations, with targets such as environmental and disaster area monitoring as well as military surveillance. Common transmission channels for UAVs offer only small bandwidths of a few Mbit/s. In this paper we propose a video codec which is able to provide full HDTV (1920 × 1080 pel) resolution, including moving objects, at a bit rate of about 1-3 Mbit/s (instead of 8-15 Mbit/s when using the standardized AVC codec). The coding system is based on an AVC video codec which is controlled by ROI detectors; furthermore, we make use of additional Global Motion Compensation (GMC). In a modular concept, different Region of Interest (ROI) detectors can be added to adjust the coding system to specific operation targets. This paper presents a coding system with two motion-based ROI detectors: one for new area detection (ROI-NA) and another for moving objects (ROI-MO). Our system preserves more detail than an AVC coder at the same bit rate of 1.0 Mbit/s for the entire frame.
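
    A simple way to picture ROI-driven rate allocation is a per-macroblock quantization-parameter map derived from the detector output; the sketch below is a generic illustration with made-up QP values, not the paper's AVC rate control.

        import numpy as np

        def qp_map(roi_mask, mb_size=16, qp_roi=28, qp_bg=40):
            # roi_mask: (H, W) boolean map from ROI detectors (e.g. ROI-NA / ROI-MO).
            # Returns one quantization parameter per macroblock: spend bits on
            # blocks that overlap a region of interest, save them elsewhere.
            H, W = roi_mask.shape
            rows, cols = H // mb_size, W // mb_size
            qps = np.full((rows, cols), qp_bg, dtype=int)
            for r in range(rows):
                for c in range(cols):
                    block = roi_mask[r*mb_size:(r+1)*mb_size, c*mb_size:(c+1)*mb_size]
                    if block.any():
                        qps[r, c] = qp_roi
            return qps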

  • Spatio-chromatic decorrelation by shift-invariant filtering

    Publication Year: 2011 , Page(s): 27 - 34

    In this paper we derive convolutional filters for colour image whitening and decorrelation. Whilst whitening can be achieved via eigendecomposition of the image patch covariance, this operation is neither efficient nor biologically plausible. Given the shift invariance of image statistics, the covariance matrix contains repeated information which can be eliminated by solving directly for a per-pixel linear operation (convolution). We formulate decorrelation as a shift- and rotation-invariant filtering operation and solve directly for the filter shape via non-linear least squares. This results in opponent-colour lateral inhibition filters which resemble those found in the human visual system. We also note the similarity of these filters to current interest point detectors, and perform an experimental evaluation of their use in this context.
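
    For contrast, the eigendecomposition-based patch whitening that the paper argues is inefficient (and replaces with a single shift-invariant filter) looks like this in its ZCA form; this is the baseline, not the paper's filter-fitting method.

        import numpy as np

        def zca_whitening_matrix(patches, eps=1e-5):
            # patches: (N, D) vectorized colour image patches (e.g. 8x8x3 -> D=192).
            X = patches - patches.mean(axis=0)
            C = X.T @ X / len(X)                       # patch covariance
            evals, evecs = np.linalg.eigh(C)
            W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
            return W                                   # whitened patches: X @ W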

  • Learning multi-view correspondences from temporal coincidences

    Publication Year: 2011 , Page(s): 9 - 16
    Cited by:  Papers (1)

    We propose a new learning approach to determine the geometric and photometric relationship between multiple cameras which have at least partially overlapping fields of view. The essential difference to standard matching techniques is that the search for similar spatial patterns is replaced by an analysis of temporal coincidences of single pixels. This analysis is located on a very low level in the processing hierarchy, since it is hypothesized to be a primary feature of visual perception, useful also for technical vision systems. The proposed scheme yields an array of probability distributions that represent the geometrical structure of these correspondences for arbitrary relative orientations of the cameras, arbitrary imaging geometry (perspective, cata-dioptric, etc.), and under large tolerance for photometric differences in the image sensors.
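
    At its simplest, the temporal-coincidence idea can be tried with plain correlation of per-pixel intensity traces; the brute-force sketch below (suitable only for small images because of the dense correlation matrix) is an illustration, not the authors' probabilistic formulation.

        import numpy as np

        def temporal_correspondences(seq_a, seq_b):
            # seq_a: (T, Ha, Wa), seq_b: (T, Hb, Wb) synchronized grey-level videos
            # of two cameras with overlapping views. For every pixel of camera A,
            # find the pixel of camera B whose intensity trace over time is most
            # correlated.
            T = seq_a.shape[0]
            A = seq_a.reshape(T, -1).astype(float)
            B = seq_b.reshape(T, -1).astype(float)
            A = (A - A.mean(0)) / (A.std(0) + 1e-9)
            B = (B - B.mean(0)) / (B.std(0) + 1e-9)
            corr = A.T @ B / T                      # (pixels_a, pixels_b) correlations
            best = np.argmax(corr, axis=1)          # best match in camera B
            Hb, Wb = seq_b.shape[1:]
            return np.stack([best // Wb, best % Wb], axis=1)   # (row, col) per A-pixel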

  • Deciphering the face

    Publication Year: 2011 , Page(s): 7 - 12
    Cited by:  Papers (4)

    We argue that to make robust computer vision algorithms for face analysis and recognition, these should be based on configural and shape features. In this model, the most important task to be solved by computer vision researchers is that of accurate detection of facial features, rather than recognition. We base our arguments on recent results in cognitive science and neuroscience. In particular, we show that different facial expressions of emotion have diverse uses in human behavior/cognition and that a facial expression may be associated to multiple emotional categories. These two results are in contradiction with the continuous models in cognitive science, the limbic assumption in neuroscience and the multidimensional approaches typically employed in computer vision. Thus, we propose an alternative hybrid continuous-categorical approach to the perception of facial expressions and show that configural and shape features are most important for the recognition of emotional constructs by humans. We illustrate how these image cues can be successfully exploited by computer vision algorithms. Throughout the paper, we discuss the implications of these results in applications in face recognition and human-computer interaction.

  • Using eye gaze, head pose, and facial expression for personalized non-player character interaction

    Publication Year: 2011 , Page(s): 13 - 18
    Cited by:  Papers (3)

    True immersion of a user within a game is only possible when the world simulated looks and behaves as close to reality as possible. This implies that the game must ascertain, among other things, the user's focus and his/her attitude towards the object or person focused on. As part of the effort to achieve this goal, we propose an eye gaze, head pose, and facial expression system for use in real-time games. Both the eye gaze and head pose components utilize underlying 3D models, while the expression recognition module uses the effective but efficient LBP-TOP approach. We then demonstrate the utility of this system in a test application wherein the user looks at one of three non-player characters (NPC) and performs one of the 7 prototypic expressions; the NPC responds based on its personality. To increase the speed and efficiency of the system, the eye gaze and expression recognition modules leverage CUDA and GLSL pixel shaders.

  • Efficient nonlinear DTI registration using DCT basis functions

    Publication Year: 2011 , Page(s): 17 - 22

    In this paper a nonlinear registration algorithm for diffusion tensor (DT) MR images is proposed. The nonlinear deformation is modeled using a combination of Discrete Cosine Transformation (DCT) basis functions thus reducing the number of parameters that need to be estimated. This approach was demonstrated to be an effective method for scalar image registration via SPM, and we show here how it can be extended to tensor images. The proposed approach employs the full tensor information via a Euclidean distance metric. Tensor reorientation is explicitly determined from the nonlinear deformation model and applied during the optimization process. We evaluate the proposed approach both quantitatively and qualitatively and show that it results in improved performance in terms of trace error and Euclidean distance error when compared to a tensor registration method (DTI-TK). The computational efficiency of the proposed approach is also evaluated and compared.
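
    The DCT parameterization itself is compact: the dense displacement field is a separable combination of low-frequency cosine basis functions, which is what keeps the number of parameters small. The sketch below constructs such a field from a small coefficient grid; function names and sizes are illustrative, not taken from the paper or from SPM.

        import numpy as np

        def dct_basis(n_points, n_basis):
            # 1D DCT-II-style basis functions sampled at n_points positions.
            x = (np.arange(n_points) + 0.5) / n_points
            return np.stack([np.cos(np.pi * k * x) for k in range(n_basis)], axis=1)

        def deformation_field(coeff_x, coeff_y, shape):
            # coeff_x, coeff_y: (K, K) coefficient grids for x and y displacements.
            # Returns dense (H, W) displacement maps built from the 2D separable
            # low-frequency DCT basis.
            H, W = shape
            K = coeff_x.shape[0]
            By, Bx = dct_basis(H, K), dct_basis(W, K)      # (H, K) and (W, K)
            dx = By @ coeff_x @ Bx.T                       # (H, W) x-displacements
            dy = By @ coeff_y @ Bx.T                       # (H, W) y-displacements
            return dx, dy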

  • Prototyping a light field display involving direct observation of a video projector array

    Publication Year: 2011 , Page(s): 15 - 20
    Cited by:  Papers (1)

    We present a concept for a full-parallax light field display achieved by having users look directly into an array of video projectors. Each projector acts as one angularly-varying pixel, so the display's spatial resolution depends on the number of video projectors and the angular resolution depends on the pixel resolution of any one video projector. We prototype a horizontal-parallax-only arrangement by mechanically moving a single pico-projector to an array of positions, and use long-exposure photography to simulate video of a horizontal array. With this setup, we determine the minimal projector density required to produce a continuous image, and describe practical ways to achieve such density and to realize the resulting system. We finally show that if today's pico-projectors become sufficiently inexpensive, immersive full-parallax displays with arbitrarily high spatial and angular resolution will become possible.

  • Prioritized data transmission in airborne camera networks for wide area surveillance and image mosaicking

    Publication Year: 2011 , Page(s): 17 - 24
    Cited by:  Papers (1)

    Unmanned aerial vehicles (UAVs) are an emerging research area; we equip them with high-resolution cameras and build a wireless network for wide area surveillance. The sensed telemetry data and images are processed on-board in a distributed manner to generate an orthographic mosaic augmented with the sensed data. In this work we present a prioritized data transmission scheme for a wireless network of mobile aerial camera nodes for wide area surveillance. The goal of this protocol is to transfer the telemetry data, mosaicking data and images efficiently over the limited wireless network such that an overview image can be generated incrementally. Our experiments with up to four UAVs demonstrate very short delays for the final mosaic, owing to the prioritization by the network protocol: low-resolution image data and metadata for mosaicking are prioritized over the full-sized image data.

  • Segmentation-robust representations, matching, and modeling for sign language

    Publication Year: 2011 , Page(s): 13 - 19

    Distinguishing true signs from transitional, extraneous movements as the signer moves from one sign to the next is a serious hurdle in the design of continuous sign language recognition systems. This problem is further compounded by the ambiguity of segmentation and occlusions. This short paper provides an overview of our experience with representations and matching methods, particularly those that can handle errors in low-level segmentation and uncertainties of sign boundaries in sentences. We have formulated a novel framework that can address both these problems using a nested, level-building based dynamic programming approach that works for matching two instances of signs as well as for matching an instance to an abstracted statistical model in the form of a Hidden Markov Model (HMM). We also present our approach to sign recognition that does not need hand tracking over frames, but rather abstracts and uses the global configuration of low-level features from hands and faces. These global representations are used not only for recognition, but also to extract and to automatically learn models of signs from continuous sentences in a weakly unsupervised manner. Our publications that discuss these issues and solutions in more detail can be found at http://marathon.csee.usf.edu/ASL/.

  • Learning human behaviour patterns in work environments

    Publication Year: 2011 , Page(s): 47 - 52

    In this paper, we propose a flexible, human-oriented framework for learning the behaviour pattern of the users in work environments from visual sensors. The knowledge of human behaviour pattern enables the ambient environment to communicate with the user in a seamless way and make anticipatory decisions, from the automation of appliances and personal schedule reminder to the detection of unhealthy habits. Our learning method is general and learns from a set of activity sequences, where the granularity of activities can vary for different applications. Algorithms to extract the activity information from the videos are described. We evaluate our method on video sequences captured in a real office, where the user's daily routine is recorded over a month. The results show that our approach is capable of not only identifying the frequent behaviour of the user, but also the time relations and conditions of the activities.
