
2014 IEEE Winter Conference on Applications of Computer Vision (WACV)

24-26 March 2014


  • A discriminative parts based model approach for fiducial points free and shape constrained head pose normalisation in the wild

    Publication Year: 2014 , Page(s): 1 - 2

    Continuous Confidence Map Based Normalisation: While continuous head pose normalisation is not the goal of this paper, we demonstrate as a proof of concept that the current method can be extended to continuous head pose normalisation, which is required for dealing with faces in videos [1]. [2] argue that the appearance of a part does not change with a subtle pose change, so a detector for part i at pose angle p can be shared with the same part i at pose angle p + δ. Further experiments in [2] showed that sharing-based models and independent models have comparable performance; however, sharing-based models are up to ten times faster than independent models [2]. The confidence-map-based methods (CM-HPNPS and CM-HPNPI) can be extended from discrete to continuous by sharing the part-specific regression models R among neighboring pose angles.

  • Data-driven road detection

    Publication Year: 2014 , Page(s): 1134 - 1141

    In this paper, we tackle the problem of road detection from RGB images. In particular, we follow a data-driven approach to segmenting the road pixels in an image. To this end, we introduce two road detection methods: a top-down approach that builds an image-level road prior based on the traffic pattern observed in an input image, and a bottom-up technique that estimates the probability that an image superpixel belongs to the road surface in a nonparametric manner. Both our algorithms work on the principle of label transfer, in the sense that the road prior is directly constructed from the ground-truth segmentations of training images. Our experimental evaluation on four different datasets shows that this approach outperforms existing top-down and bottom-up techniques, and is key to making road detection algorithms robust to dataset bias.
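
    A minimal sketch of the nonparametric, bottom-up idea as described: each query superpixel inherits a road probability from its nearest training superpixels (label transfer). The 20-D descriptors and neighbour count below are illustrative assumptions, not the paper's features.

```python
# Label transfer for road detection: distance-weighted kNN vote over
# ground-truth road labels of training superpixels.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(5000, 20))   # descriptors of training superpixels (placeholder)
train_is_road = rng.integers(0, 2, 5000)    # ground-truth road labels (0/1)
query_feats = rng.normal(size=(300, 20))    # superpixels of the input image

knn = NearestNeighbors(n_neighbors=25).fit(train_feats)
dist, idx = knn.kneighbors(query_feats)

# Distance-weighted vote over retrieved neighbours gives a soft road prior.
weights = np.exp(-dist / (dist.mean() + 1e-8))
road_prob = (weights * train_is_road[idx]).sum(1) / weights.sum(1)
print(road_prob[:5])
```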

  • Segmentation and tracking of partial planar templates

    Publication Year: 2014 , Page(s): 1128 - 1133

    We present an algorithm that can segment and track partial planar templates from a sequence of images taken by a moving camera. By “partial planar template”, we mean that the template is the projection of a surface patch that is only partially planar; some of the points may correspond to other surfaces. The algorithm segments each image template to identify the pixels that belong to the dominant plane and determines the three-dimensional structure of that plane. We show that our algorithm can track such patches over a larger visual angle compared to algorithms that assume patches arise from a single planar surface. The new tracking algorithm is expected to improve the accuracy of visual simultaneous localization and mapping, especially in outdoor natural scenes where planar features are rare.

  • Robust tracking and mapping with a handheld RGB-D camera

    Publication Year: 2014 , Page(s): 1120 - 1127

    In this paper, we propose a robust method for camera tracking and surface mapping using a handheld RGB-D camera, which is effective in challenging situations such as fast camera motion or geometrically featureless scenes. The main contributions are threefold. First, we introduce a robust quaternion-based orientation estimation for initial sparse estimation. By using visual feature point detection and matching, no prior or small-movement assumption is required to estimate a rigid transformation between frames. Second, we propose a weighted ICP (Iterative Closest Point) method with a better rate of convergence in optimization and better accuracy in the resulting trajectory. While conventional ICP fails when there are no 3D features in the scene, our approach achieves robustness by emphasizing the influence of points that carry more geometric information about the scene. Finally, experiments on an RGB-D trajectory benchmark dataset demonstrate that our method is able to track camera pose accurately.
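
    To make the weighted-ICP idea concrete, here is a minimal sketch of one weighted alignment step (point-to-point), where weights are meant to emphasise geometrically informative points. The random weights stand in for curvature-derived ones; the paper's exact weighting scheme may differ.

```python
# One weighted rigid-alignment step as used inside an ICP loop:
# weighted Kabsch solution for R, t minimising sum_i w_i ||R p_i + t - q_i||^2.
import numpy as np

def weighted_rigid_transform(src, dst, w):
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(0)
    mu_d = (w[:, None] * dst).sum(0)
    S = (w[:, None] * (src - mu_s)).T @ (dst - mu_d)          # weighted covariance
    U, _, Vt = np.linalg.svd(S)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])   # avoid reflections
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t

src = np.random.rand(1000, 3)
dst = src + np.array([0.1, -0.05, 0.2])      # synthetic translated copy
weights = np.random.rand(1000)               # stand-in for curvature-based weights
R, t = weighted_rigid_transform(src, dst, weights)
print(np.allclose(R @ src[0] + t, dst[0], atol=1e-6))
```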

  • Plant classification system for crop/weed discrimination without segmentation

    Publication Year: 2014 , Page(s): 1142 - 1149

    This paper proposes a machine vision approach for plant classification without segmentation and its application in agriculture. Our system discriminates between crop and weed plants growing close together in commercial fields and handles overlap between plants. Automated crop/weed discrimination enables weed control strategies with specific treatment of weeds to save cost and mitigate environmental impact. Instead of segmenting the image into individual leaves or plants, we use a Random Forest classifier to estimate crop/weed certainty at sparse pixel positions based on features extracted from a large overlapping neighborhood. These individual sparse results are spatially smoothed using a Markov Random Field, and continuous crop/weed regions are inferred at full image resolution through interpolation. We evaluate our approach using a dataset of images captured on an organic carrot farm with an autonomous field robot under field conditions. Applying the plant classification system to images from our dataset and performing cross-validation in a leave-one-out scheme yields an average classification accuracy of 93.8%.
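
    A toy sketch of the sparse-pixel classification step named in the abstract: a Random Forest predicts crop/weed certainty from features of a large neighbourhood around each sampled pixel. The feature dimensionality and grid size are placeholders, not the paper's descriptors.

```python
# Random Forest crop/weed certainty at sparse pixel positions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X_train = rng.normal(size=(2000, 64))   # neighbourhood features per sampled pixel
y_train = rng.integers(0, 2, 2000)      # 0 = weed, 1 = crop

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Predict certainties on a sparse grid in a new image; the paper then
# smooths these with an MRF and interpolates to full resolution.
X_grid = rng.normal(size=(400, 64))
crop_certainty = rf.predict_proba(X_grid)[:, 1]
print(crop_certainty[:5])
```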

  • Scale-Space SIFT flow

    Publication Year: 2014 , Page(s): 1112 - 1119

    The state-of-the-art SIFT flow has been widely adopted for the general image matching task, especially for image pairs from similar scenes but with different object configurations. However, the dense SIFT features in the SIFT flow method are computed at a fixed scale, which limits its capability of dealing with scenes of large scale changes. In this paper, we propose a simple, intuitive, and very effective approach, Scale-Space SIFT flow, to deal with large scale differences across image locations. We introduce a scale field to the SIFT flow objective to automatically explore the scale deformations. Our approach achieves similar performance to the SIFT flow method on general natural scenes but obtains significant improvement on images with large scale differences. Compared with a recent method that addresses a similar problem, our approach shows a clear advantage: it is more effective and significantly less demanding in memory and time.
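
    As a rough guide to what "introducing a scale field" means, a plausible form of the extended objective is sketched below; the notation and term weights are assumptions, not the paper's exact formulation. The classic SIFT flow energy gains a per-pixel scale s(p) at which the second image's descriptor is sampled, plus a smoothness term on s.

```latex
\begin{align*}
E(\mathbf{w}, s) = {} & \sum_{p} \big\| d_1(p) - d_2\big(p + \mathbf{w}(p);\, s(p)\big) \big\|_1
  + \eta \sum_{p} \big( |u(p)| + |v(p)| \big) \\
  & + \alpha \!\!\sum_{(p,q) \in \mathcal{N}} \!\! \min\big( \|\mathbf{w}(p) - \mathbf{w}(q)\|_1,\, d \big)
  + \beta \!\!\sum_{(p,q) \in \mathcal{N}} \!\! |s(p) - s(q)|
\end{align*}
```

    Here d_2(·; s) denotes the dense SIFT descriptor computed at scale s, and w(p) = (u(p), v(p)) is the flow.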

  • A lp-norm MTMKL framework for simultaneous detection of multiple facial action units

    Publication Year: 2014 , Page(s): 1104 - 1111
    Cited by:  Papers (1)

    Facial action unit (AU) detection is a challenging topic in computer vision and pattern recognition. Most existing approaches design classifiers to detect AUs individually or in combinations, without considering the intrinsic relations among AUs. This paper presents a novel method, lp-norm multi-task multiple kernel learning (MTMKL), that jointly learns the classifiers for detecting the absence and presence of multiple AUs. lp-norm MTMKL is an extension of regularized multi-task learning, which learns kernels shared among all tasks from a given set of base kernels within Support Vector Machines (SVMs). Our approach has several advantages over existing methods: (1) AU detection is transformed into an MTL problem, where, given a specific frame, multiple AUs are detected simultaneously by exploiting their inter-relations; (2) lp-norm multiple kernel learning is applied to increase the discriminative power of the classifiers. Our experimental results on the CK+ and DISFA databases show that the proposed method outperforms state-of-the-art methods for AU detection.
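
    For orientation, a generic lp-norm MTMKL objective under standard notation is sketched below; this is an assumed textbook form, not the paper's exact statement. T tasks (one per AU) share a kernel combination whose weights β carry an lp-norm constraint.

```latex
\begin{align*}
\min_{\boldsymbol{\beta} \ge 0} \;\;
  & \sum_{t=1}^{T} \left[ \min_{\mathbf{w}_t,\, b_t} \;
    \tfrac{1}{2} \|\mathbf{w}_t\|_{\mathcal{H}_{\boldsymbol{\beta}}}^{2}
    + C \sum_{i} \max\!\big(0,\, 1 - y_i^{t} ( \langle \mathbf{w}_t, \phi(\mathbf{x}_i) \rangle + b_t ) \big) \right] \\
\text{s.t.} \;\;
  & \|\boldsymbol{\beta}\|_p \le 1, \qquad
    K_{\boldsymbol{\beta}} = \sum_{m} \beta_m K_m
\end{align*}
```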

  • Fully automatic 3D facial expression recognition using local depth features

    Publication Year: 2014 , Page(s): 1096 - 1103

    Facial expressions form a significant part of our nonverbal communication, and understanding them is essential for effective human-computer interaction. Due to the diversity of facial geometry and expressions, automatic expression recognition is a challenging task. This paper deals with the problem of person-independent facial expression recognition from a single 3D scan. We consider only the 3D shape because facial expressions are mostly encoded in facial geometry deformations rather than textures. Unlike the majority of existing works, our method is fully automatic, including the detection of landmarks. We detect the four eye corners and nose tip in real time on the depth image and its gradients using Haar-like features and an AdaBoost classifier. From these five points, another 25 heuristic points are defined to extract local depth features for representing facial expressions. The depth features are projected to a lower-dimensional linear subspace where feature selection is performed by maximizing their relevance and minimizing their redundancy. The selected features are then used to train a multi-class SVM for the final classification. Experiments on the benchmark BU-3DFE database show that the proposed method outperforms existing automatic techniques and is comparable even to approaches using manual landmarks.
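
    A minimal sketch of relevance/redundancy feature selection in the spirit the abstract describes (mRMR-style greedy selection): pick features with high mutual information with the label and low mutual information with features already chosen. This is a generic stand-in, not the paper's exact criterion.

```python
# Greedy max-relevance / min-redundancy feature selection.
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X, y, k):
    relevance = mutual_info_classif(X, y, random_state=0)
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([
                mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                for s in selected])
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 30))   # projected local depth features (placeholder)
y = rng.integers(0, 6, 300)      # six prototypic expressions
print(mrmr_select(X, y, 5))
```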

  • Relative facial action unit detection

    Publication Year: 2014 , Page(s): 1090 - 1095
    Cited by:  Papers (1)

    This paper presents a subject-independent facial action unit (AU) detection method by introducing the concept of relative AU detection, for scenarios where the neutral face is not provided. We propose a new classification objective function which analyzes the temporal neighborhood of the current frame to decide if the expression recently increased, decreased, or showed no change. This approach is a significant change from the conventional absolute method, which classifies AUs from the current frame alone, without an explicit comparison with neighboring frames. Our proposed method improves robustness to individual differences such as face scale and shape, age-related wrinkles, and transitions among expressions (e.g., lower intensity of expressions). Our experiments on three publicly available datasets (Extended Cohn-Kanade (CK+), Bosphorus, and DISFA) show significant improvement of our approach over conventional absolute techniques.
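
    A toy illustration of the "relative" formulation: compare the current frame with a temporal neighbourhood and emit increased / decreased / unchanged instead of an absolute AU decision. Window size and threshold are illustrative assumptions.

```python
# Relative AU labelling from a per-frame intensity signal.
import numpy as np

def relative_labels(intensity, window=5, thresh=0.15):
    """intensity: 1-D array of per-frame AU intensity estimates."""
    labels = []
    for t in range(len(intensity)):
        past = intensity[max(0, t - window):t]
        if len(past) == 0:
            labels.append("unchanged")
            continue
        delta = intensity[t] - past.mean()
        if delta > thresh:
            labels.append("increased")
        elif delta < -thresh:
            labels.append("decreased")
        else:
            labels.append("unchanged")
    return labels

signal = np.concatenate([np.linspace(0, 1, 20), np.linspace(1, 0, 20)])
print(relative_labels(signal)[:8])
```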

  • A novel method for post-surgery face recognition using sum of facial parts recognition

    Publication Year: 2014 , Page(s): 1082 - 1089

    Plastic surgery is becoming more and more commonplace due to its increasing acceptance in society and its affordability. This in turn has led to the need for highly accurate post-surgery face recognition techniques, a problem space which differs significantly from traditional face recognition. In this paper we first conduct a statistical study showing that facial plastic surgery operations correlate with a desire to conform to a golden ratio of the human face. We then apply this knowledge, with the notion of considering a face as the sum of its parts, to propose a novel face recognition technique. The proposed technique is evaluated against well-known datasets and, in our experiments, achieves a recognition rate of 85.35%, significantly outperforming other state-of-the-art techniques.

  • Matching image sets via adaptive multi convex hull

    Publication Year: 2014 , Page(s): 1074 - 1081

    Traditional nearest points methods use all the samples in an image set to construct a single convex or affine hull model for classification. However, strong artificial features and noisy data may be generated from combinations of training samples when significant intra-class variations and/or noise occur in the image set. Existing multi-model approaches extract local models by clustering each image set individually only once, with fixed clusters used for matching with various image sets. This may not be optimal for discrimination, as undesirable environmental conditions (e.g., illumination and pose variations) may result in the two closest clusters representing different characteristics of an object (e.g., a frontal face being compared to a non-frontal face). To address the above problem, we propose a novel approach to enhance nearest points based methods by integrating affine/convex hull classification with an adapted multi-model approach. We first extract multiple local convex hulls from a query image set via maximum margin clustering to diminish the artificial variations and constrain the noise in local convex hulls. We then propose adaptive reference clustering (ARC) to constrain the clustering of each gallery image set by forcing the clusters to resemble those in the query image set. By applying ARC, noisy clusters in the query set can be discarded. Experiments on the Honda, MoBo and ETH-80 datasets show that the proposed method outperforms single-model approaches and other recent techniques, such as Sparse Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant Analysis.

  • Simultaneous recognition of facial expression and identity via sparse representation

    Publication Year: 2014 , Page(s): 1066 - 1073

    Automatic recognition of facial expression and facial identity from visual data are two challenging problems that are tied together. In the past decade, researchers have mostly tried to solve these two problems separately, producing face identification systems that are expression-independent and facial expression recognition systems that are person-independent. This paper presents a new framework using sparse representation for simultaneous recognition of facial expression and identity. Our framework is based on the assumption that any facial appearance is a sparse combination of identities and expressions (i.e., one identity and one expression). Our experimental results using the CK+ and MMI face datasets show that the proposed approach outperforms methods that conduct face identification and expression recognition separately.

  • Predicting movie ratings from audience behaviors

    Publication Year: 2014 , Page(s): 1058 - 1065

    We propose a method of representing audience behavior through facial and body motions from a single video stream, and use these features to predict the rating for feature-length movies. This is a very challenging problem as: i) the movie viewing environment is dark and contains views of people at different scales and viewpoints; ii) the duration of feature-length movies is long (80-120 mins) so tracking people uninterrupted for this length of time is still an unsolved problem; and iii) expressions and motions of audience members are subtle, short and sparse making labeling of activities unreliable. To circumvent these issues, we use an infrared illuminated test-bed to obtain a visually uniform input. We then utilize motion-history features which capture the subtle movements of a person within a pre-defined volume, and then form a group representation of the audience by a histogram of pair-wise correlations over a small-window of time. Using this group representation, we learn our movie rating classifier from crowd-sourced ratings collected by rottentomatoes.com and show our prediction capability on audiences from 30 movies across 250 subjects (> 50 hrs).
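
    A sketch of the group representation described above: per-subject motion-energy time series are compared pairwise over a short window, and the correlations are pooled into a histogram describing the audience as a whole. Window length and bin count are assumptions for illustration.

```python
# Histogram of pairwise correlations over a small time window.
import numpy as np

rng = np.random.default_rng(3)
motion = rng.random(size=(30, 600))   # 30 audience members, 600 time steps

def group_histogram(motion, t0, window=60, bins=10):
    seg = motion[:, t0:t0 + window]
    corr = np.corrcoef(seg)                   # pairwise Pearson correlations
    iu = np.triu_indices(len(corr), k=1)      # unique subject pairs only
    hist, _ = np.histogram(corr[iu], bins=bins, range=(-1, 1), density=True)
    return hist

print(group_histogram(motion, t0=120))
```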

  • AutoCaption: Automatic caption generation for personal photos

    Publication Year: 2014 , Page(s): 1050 - 1057

    AutoCaption is a system that helps a smartphone user generate a caption for their photos. It operates by uploading the photo to a cloud service where a number of parallel modules are applied to recognize a variety of entities and relations. The outputs of the modules are combined to generate a large set of candidate captions, which are returned to the phone. The phone client includes a convenient user interface that allows users to select their favorite caption, reorder, add, or delete words to obtain the grammatical style they prefer. The user can also select from multiple candidates returned by the recognition modules.

  • Exploring the geo-dependence of human face appearance

    Publication Year: 2014 , Page(s): 1042 - 1049

    The expected appearance of a human face depends strongly on age, ethnicity and gender. While these relationships are well-studied, our work explores the little-studied dependence of facial appearance on geographic location. To support this effort, we constructed GeoFaces, a large dataset of geotagged face images. We examine the geo-dependence of Eigenfaces and use two supervised methods for extracting geo-informative features. The first, canonical correlation analysis, is used to find location-dependent component images as well as the spatial direction of most significant face appearance change. The second, linear discriminant analysis, is used to find countries with relatively homogeneous, yet distinctive, facial appearance.
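
    A small sketch of the first supervised analysis named above: canonical correlation analysis between face-appearance features and geographic coordinates, yielding the appearance component that co-varies most with location. Feature dimensions and the use of raw latitude/longitude are illustrative assumptions.

```python
# CCA between face features and geographic location.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(6)
face_feats = rng.normal(size=(1000, 50))              # e.g. Eigenface coefficients
geo = rng.uniform([-90, -180], [90, 180], (1000, 2))  # latitude, longitude

cca = CCA(n_components=1).fit(face_feats, geo)
face_dir, geo_dir = cca.x_weights_[:, 0], cca.y_weights_[:, 0]
print(face_dir.shape, geo_dir)   # appearance component and spatial direction
```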

  • Improving multiview face detection with multi-task deep convolutional neural networks

    Publication Year: 2014 , Page(s): 1036 - 1041

    Multiview face detection is a challenging problem due to dramatic appearance changes under various pose, illumination and expression conditions. In this paper, we present a multi-task deep learning scheme to enhance the detection performance. More specifically, we build a deep convolutional neural network that simultaneously learns face/nonface classification, face pose estimation, and facial landmark localization. We show that such a multi-task learning scheme can further improve the classifier's accuracy. On the challenging FDDB data set, our detector achieves over 3% improvement in detection rate at the same false positive rate compared with other state-of-the-art methods.
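
    A minimal PyTorch sketch of the multi-task idea: one shared convolutional trunk feeding three heads (face/nonface, pose class, landmark coordinates), trained with a summed loss. Layer sizes, head dimensions and loss weights are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiTaskFaceNet(nn.Module):
    def __init__(self, n_poses=5, n_landmarks=5):
        super().__init__()
        self.trunk = nn.Sequential(                     # shared feature extractor
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.face_head = nn.Linear(32 * 16, 2)              # face / nonface
        self.pose_head = nn.Linear(32 * 16, n_poses)        # discrete pose bins
        self.lmk_head = nn.Linear(32 * 16, n_landmarks * 2) # (x, y) per landmark

    def forward(self, x):
        h = self.trunk(x)
        return self.face_head(h), self.pose_head(h), self.lmk_head(h)

net = MultiTaskFaceNet()
imgs = torch.randn(8, 1, 32, 32)
face_logits, pose_logits, lmks = net(imgs)
loss = (nn.functional.cross_entropy(face_logits, torch.randint(0, 2, (8,)))
        + nn.functional.cross_entropy(pose_logits, torch.randint(0, 5, (8,)))
        + nn.functional.mse_loss(lmks, torch.randn(8, 10)))
loss.backward()
```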

  • A discriminative parts based model approach for fiducial points free and shape constrained head pose normalisation in the wild

    Publication Year: 2014 , Page(s): 1028 - 1035

    This paper proposes a method for parts-based view-invariant head pose normalisation, which works well even in difficult real-world conditions. Handling pose is a classical problem in facial analysis. Recently, parts-based models have shown promising performance for facial landmark point detection `in the wild'. Leveraging the success of these models, the proposed data-driven regression framework computes a constrained normalised virtual frontal head pose. The response maps of a discriminatively trained part detector are used as texture information. These sparse texture maps are projected from non-frontal to frontal pose using block-wise structured regression. Finally, a facial kinematic shape constraint is achieved by applying a shape model. The advantages of the proposed approach are: (a) no explicit dependence on the outputs of a facial parts detector, thus avoiding any error propagation owing to their failure; (b) the application of a shape prior on the reconstructed frontal maps provides an anatomically constrained facial shape; and (c) modelling head pose as a mixture-of-parts model allows the framework to work without any prior pose information. Experiments are performed on the Multi-PIE and the `in the wild' SFEW databases. The results demonstrate the effectiveness of the proposed method.

  • The effectiveness of face detection algorithms in unconstrained crowd scenes

    Publication Year: 2014 , Page(s): 1020 - 1027

    The 2013 Boston Marathon bombing represents a case where automatic facial biometrics tools could have proven invaluable to law enforcement officials, yet the lack of robustness of current tools in unstructured environments limited their utility. In this work, we focus on complications that confound face detection algorithms. We first present a simple multi-pose generalization of the Viola-Jones algorithm. Our results on the Face Detection Data Set and Benchmark (FDDB) show that it makes a significant improvement over the state of the art for published algorithms. Conversely, our experiments demonstrate that the improvements attained by accommodating multiple poses can be negligible compared to the gains yielded by normalizing scores and using the most appropriate classifier for uncontrolled data. We conclude with a qualitative evaluation of the proposed algorithm on publicly available images of the Boston Marathon crowds. Although the results of our evaluations are encouraging, they confirm that there is still room for improvement in terms of robustness to out-of-plane rotation, blur and occlusion.
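
    A toy multi-pose generalization in the same spirit: run several pretrained Viola-Jones cascades (frontal and profile) from the stock OpenCV distribution and merge their detections with a simple overlap-based suppression. The overlap threshold and input file are assumptions; this is not the paper's detector.

```python
import cv2

img = cv2.imread("crowd.jpg", cv2.IMREAD_GRAYSCALE)
cascade_files = ["haarcascade_frontalface_default.xml",
                 "haarcascade_profileface.xml"]

boxes = []
for name in cascade_files:  # one cascade per pose bin
    det = cv2.CascadeClassifier(cv2.data.haarcascades + name)
    boxes.extend(det.detectMultiScale(img, scaleFactor=1.1, minNeighbors=4))

def iou(a, b):
    ax, ay, aw, ah = a; bx, by, bw, bh = b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / float(aw * ah + bw * bh - inter)

merged = []
for b in sorted(boxes, key=lambda r: -r[2] * r[3]):   # larger boxes first
    if all(iou(b, m) < 0.3 for m in merged):          # suppress duplicates
        merged.append(b)
print(len(merged), "faces after merging")
```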

  • Extending explicit shape regression with mixed feature channels and pose priors

    Publication Year: 2014 , Page(s): 1013 - 1019

    Facial feature detection offers a wide range of applications, e.g. in facial image processing, human computer interaction, consumer electronics, and the entertainment industry. These applications impose two antagonistic key requirements: high processing speed and high detection accuracy. We address both by expanding upon the recently proposed explicit shape regression [1] to (a) allow usage and mixture of different feature channels, and (b) include head pose information to improve detection performance in non-cooperative environments. Using the publicly available “wild” datasets LFW [10] and AFLW [11], we show that using these extensions outperforms the baseline (up to 10% gain in accuracy at 8% IOD) as well as other state-of-the-art methods.

  • Comparison of face detection and image classification for detecting front seat passengers in vehicles

    Publication Year: 2014 , Page(s): 1006 - 1012
    Cited by:  Papers (1)

    Due to the high volume of traffic on modern roadways, transportation agencies have proposed High Occupancy Vehicle (HOV) lanes and High Occupancy Tolling (HOT) lanes to promote car pooling. However, enforcement of the rules of these lanes is currently performed by roadside enforcement officers using visual observation. Manual roadside enforcement is known to be inefficient, costly, potentially dangerous, and ultimately ineffective. Violation rates up to 50%-80% have been reported, while manual enforcement rates of less than 10% are typical. Therefore, there is a need for automated vehicle occupancy detection to support HOV/HOT lane enforcement. A key component of determining vehicle occupancy is to determine whether or not the vehicle's front passenger seat is occupied. In this paper, we examine two methods of determining vehicle front seat occupancy using a near infrared (NIR) camera system pointed at the vehicle's front windshield. The first method examines a state-of-the-art deformable part model (DPM) based face detection system that is robust to facial pose. The second method examines state-of-the-art local aggregation based image classification using bag-of-visual-words (BOW) and Fisher vectors (FV). A dataset of 3000 images was collected on a public roadway and is used to perform the comparison. From these experiments it is clear that the image classification approach is superior for this problem.

  • Writer identification and verification using GMM supervectors

    Publication Year: 2014 , Page(s): 998 - 1005

    This paper proposes a new system for offline writer identification and writer verification. The proposed method uses GMM supervectors to encode the feature distribution of individual writers. Each supervector originates from an individual GMM which has been adapted from a background model via a maximum-a-posteriori step followed by mixing the new statistics with the background model. We show that this approach improves the TOP-1 accuracy of the current best-ranked methods from 95.1% [13] to 97.1% on the ICDAR 2013 competition dataset, and from 97.9% [11] to 99.2% on the CVL dataset. Additionally, we compare the GMM supervector encoding with other encoding schemes, namely Fisher vectors and Vectors of Locally Aggregated Descriptors.
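
    A condensed sketch of the encoding step: fit a background GMM (UBM) on features pooled over all writers, MAP-adapt its means to one writer's features, and stack the adapted means into a supervector. The relevance factor r and feature dimensions are conventional choices, not necessarily the paper's values.

```python
# GMM supervector via MAP adaptation of the mean vectors.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
background = rng.normal(size=(20000, 24))        # pooled local descriptors
ubm = GaussianMixture(n_components=64, covariance_type="diag",
                      random_state=0).fit(background)

def supervector(feats, ubm, r=16.0):
    resp = ubm.predict_proba(feats)              # (N, K) responsibilities
    n_k = resp.sum(0)                            # soft counts per component
    f_k = resp.T @ feats                         # first-order statistics
    m_k = f_k / np.maximum(n_k[:, None], 1e-8)   # per-component sample means
    alpha = (n_k / (n_k + r))[:, None]
    adapted = alpha * m_k + (1 - alpha) * ubm.means_   # MAP mean update
    return adapted.ravel()                       # stack into one supervector

writer_feats = rng.normal(size=(800, 24))
print(supervector(writer_feats, ubm).shape)      # (64 * 24,)
```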

  • Finger-knuckle-print verification based on vector consistency of corresponding interest points

    Publication Year: 2014 , Page(s): 992 - 997

    This paper proposes a novel finger-knuckle-print (FKP) verification method based on vector consistency among corresponding interest points (CIPs) detected from aligned finger images. We use two different approaches for reliable detection of CIPs: one employs SIFT features and captures gradient directionality, and the other employs phase correlation to represent the intensity field surrounding an interest point. The consistency of the displacement vectors between matching CIPs is used as the matching score: these displacements are consistent in a genuine match but not in an impostor match. Experimental results show that the proposed approach is effective in FKP verification.
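
    A toy version of the consistency score: for matched interest-point pairs, genuine matches produce displacement vectors that agree, impostor matches do not. The tolerance value is an assumption for illustration.

```python
# Displacement-consistency score over matched interest points.
import numpy as np

def consistency_score(pts_a, pts_b, tol=3.0):
    """pts_a, pts_b: (N, 2) arrays of corresponding point coordinates."""
    disp = pts_b - pts_a                    # per-pair displacement vectors
    med = np.median(disp, axis=0)           # dominant displacement
    dev = np.linalg.norm(disp - med, axis=1)
    return float((dev < tol).mean())        # fraction of consistent pairs

rng = np.random.default_rng(5)
pts = rng.random(size=(50, 2)) * 100
genuine = pts + np.array([4.0, -2.0]) + rng.normal(scale=0.5, size=(50, 2))
impostor = rng.random(size=(50, 2)) * 100
print(consistency_score(pts, genuine), consistency_score(pts, impostor))
```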

  • Adaptive representations for video-based face recognition across pose

    Publication Year: 2014 , Page(s): 984 - 991

    In this paper, we address the problem of matching faces across changes in pose in unconstrained videos. We propose two methods based on 3D rotation and sparse representation that compensate for changes in pose. The first is Sparse Representation-based Alignment (SRA), which generates pose-aligned features under a sparsity constraint. The mapping for the pose-aligned features is learned from a reference set of face images that is independent of the videos used in the experiments; it therefore generalizes across datasets. The second is a Dictionary Rotation (DR) method that directly rotates video dictionary atoms in both their harmonic basis and 3D geometry to match the poses of the probe videos. We demonstrate the effectiveness of our approach over several state-of-the-art algorithms through extensive experiments on three challenging unconstrained video datasets: the video challenge of the Face and Ocular Challenge Series (FOCS), the Multiple Biometrics Grand Challenge (MBGC), and the Human ID datasets.

  • Iris crypts: Multi-scale detection and shape-based matching

    Publication Year: 2014 , Page(s): 977 - 983

    This paper presents an improved framework for iris crypt detection and matching that outperforms both previous methods and manual annotations. The system uses a multi-scale pyramid architecture to detect feature candidates before they are further examined and optimized by heuristic-based methods. The dissimilarity between two irises is measured by a two-stage matcher in simple-to-complex order. The first stage estimates the global dissimilarity and rejects the majority of non-matching candidates. The surviving pairs are matched by the local dissimilarity between each crypt pair using shape descriptors. The proposed framework shows significant performance improvements in both identification and verification contexts.

  • Active clustering with ensembles for social structure extraction

    Publication Year: 2014 , Page(s): 969 - 976
    Cited by:  Papers (2)

    We introduce a method for extracting the social network structure for the persons appearing in a set of video clips. Individuals are unknown, and are not matched against known enrollments. An identity cluster representing an individual is formed by grouping similar-appearing faces from different videos. Each identity cluster is represented by a node in the social network. Two nodes are linked if the faces from their clusters appeared together in one or more video frames. Our approach incorporates a novel active clustering technique to create more accurate identity clusters based on feedback from the user about ambiguously matched faces. The final output consists of one or more network structures that represent the social group(s), and a list of persons who potentially connect multiple social groups. Our results demonstrate the efficacy of the proposed clustering algorithm and network analysis techniques.
