• ### Tracking-by-Detection of 3D Human Shapes: from Surfaces to Volumes

Publication Year: 2017, Page(s): 1
3D Human shape tracking consists in fitting a template model to temporal sequences of visual observations. It usually comprises an association step, that finds correspondences between the model and the input data, and a deformation step, that fits the model to the observations given correspondences. Most current approaches follow the Iterative-Closest-Point (ICP) paradigm, where the association st...
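
The alternation between an association step and a deformation step can be illustrated with a toy example. The sketch below is not the method of the paper: it assumes a rigid 2D point set and a translation-only fit, and the function name `icp_translation` is hypothetical. It only shows the ICP-style loop the abstract refers to.

```python
from math import dist

def icp_translation(model, observations, iterations=10):
    """Toy ICP: align a non-empty 2D `model` to `observations` by translation only."""
    tx, ty = 0.0, 0.0
    for _ in range(iterations):
        moved = [(x + tx, y + ty) for x, y in model]
        # Association step: match each model point to its closest observation.
        matches = [min(observations, key=lambda o: dist(p, o)) for p in moved]
        # Fitting step (stands in for the deformation step): the least-squares
        # translation given the matches is the mean residual of matched pairs.
        dx = sum(m[0] - p[0] for p, m in zip(moved, matches)) / len(moved)
        dy = sum(m[1] - p[1] for p, m in zip(moved, matches)) / len(moved)
        tx, ty = tx + dx, ty + dy
        if abs(dx) < 1e-12 and abs(dy) < 1e-12:
            break  # the estimate no longer changes
    return tx, ty

# Usage: observations are the model shifted by (0.5, -0.3).
model = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
observations = [(x + 0.5, y - 0.3) for x, y in model]
tx, ty = icp_translation(model, observations)
```

A deformable template would replace the translation update with a fit over many more parameters, but the two-step structure stays the same.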

• ### Gaussian Process Morphable Models

Publication Year: 2017, Page(s): 1
Models of shape variations have become a central component for the automated analysis of images. An important class of shape models are point distribution models (PDMs). These models represent a class of shapes as a normal distribution of point variations, whose parameters are estimated from example shapes. Principal component analysis (PCA) is applied to obtain a low-dimensional representation of...
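
As a reading aid, a PDM of the kind described here is commonly written as follows. The symbols are assumptions chosen for illustration and are not taken from the truncated abstract: $s$ is a shape vector, $\mu$ and $\Sigma$ are the mean and covariance estimated from example shapes, $u_i$ and $d_i$ are the eigenvectors and eigenvalues of $\Sigma$, and $r$ is the number of retained principal components.

$$
s \sim \mathcal{N}(\mu, \Sigma), \qquad \Sigma = U D U^{\top}, \qquad s \approx \mu + \sum_{i=1}^{r} \alpha_i \sqrt{d_i}\, u_i, \quad \alpha_i \sim \mathcal{N}(0, 1)
$$

The approximation becomes exact in distribution when $r$ equals the rank of $\Sigma$.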

• ### Photorealistic Monocular Gaze Redirection Using Machine Learning

Publication Year: 2017, Page(s): 1
We propose a general approach to the gaze redirection problem in images that utilizes machine learning. The idea is to learn to re-synthesize images by training on pairs of images with known disparities between gaze directions. We show that such learning-based re-synthesis can achieve convincing gaze redirection based on monocular input, and that the learned systems generalize well to people and i...

• ### Simultaneous Clustering and Model Selection: Algorithm, Theory and Applications

Publication Year: 2017, Page(s): 1
While clustering has been well studied in the past decade, model selection has drawn much less attention due to the difficulty of the problem. In this paper, we address both problems in a joint manner by recovering an ideal affinity tensor from an imperfect input. By taking into account the relationship of the affinities induced by the cluster structures, we are able to significantly improve the a...

• ### Transduction on Directed Graphs via Absorbing Random Walks

Publication Year: 2017, Page(s): 1
In this paper we consider the problem of graph-based transductive classification, and we are particularly interested in the directed graph scenario which is a natural form for many real world applications. Different from existing research efforts that either only deal with undirected graphs or circumvent directionality by means of symmetrization, we propose a novel random walk approach on directed ...

• ### Jointly learning deep features, deformable parts, occlusion and classification for pedestrian detection

Publication Year: 2017, Page(s): 1
Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation...

• ### Faceness-Net: Face Detection through Deep Facial Part Responses

Publication Year: 2017, Page(s): 1
We propose a deep convolutional neural network (CNN) for face detection leveraging on facial attributes based supervision. We observe a phenomenon that part detectors emerge within CNN trained to classify attributes from uncropped face images, without any explicit part supervision. The observation motivates a new method for finding faces through scoring facial parts responses by their spatial stru...

• ### Heterogeneous Face Attribute Estimation: A Deep Multi-Task Learning Approach

Publication Year: 2017, Page(s): 1
Face attribute estimation has many potential applications in video surveillance, face retrieval, and social media. While a number of methods have been proposed for face attribute estimation, most of them did not explicitly consider the attribute correlation and heterogeneity (e.g., ordinal vs. nominal and holistic vs. local) during feature representation learning. In this paper, we present a Deep ...

• ### Piecewise-Planar StereoScan: Sequential Structure and Motion using Plane Primitives

Publication Year: 2017, Page(s): 1
The article describes a pipeline that receives as input a sequence of stereo images, and outputs the camera motion and a Piecewise-Planar Reconstruction (PPR) of the scene. The pipeline, named Piecewise-Planar StereoScan (PPSS), works as follows: the planes in the scene are detected for each stereo view using semi-dense depth estimation; the relative pose is computed by a new closed-form minimal a...

• ### Best-Buddies Similarity - Robust Template Matching using Mutual Nearest Neighbors

Publication Year: 2017, Page(s): 1
We propose a novel method for template matching in unconstrained environments. Its essence is the Best-Buddies Similarity (BBS), a useful, robust, and parameter-free similarity measure between two sets of points. BBS is based on counting the number of Best-Buddies Pairs (BBPs): pairs of points in source and target sets, where each point is the nearest neighbor of the other. BBS has several key feat...
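
The counting of mutual nearest neighbors described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes points are plain coordinate tuples compared by Euclidean distance, and the normalization by the smaller set size is an assumption added so that the score lies in [0, 1]. The function names are hypothetical.

```python
from math import dist

def nearest_index(point, candidates):
    """Index of the candidate closest to `point` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda j: dist(point, candidates[j]))

def best_buddies_similarity(source, target):
    """Count mutual nearest-neighbor pairs, divided by the smaller set size."""
    if not source or not target:
        return 0.0
    nn_in_target = [nearest_index(p, target) for p in source]
    nn_in_source = [nearest_index(q, source) for q in target]
    # (i, j) is a Best-Buddies Pair when target[j] is the nearest neighbor of
    # source[i] AND source[i] is the nearest neighbor of target[j].
    pairs = sum(1 for i, j in enumerate(nn_in_target) if nn_in_source[j] == i)
    return pairs / min(len(source), len(target))

# Usage: only (0, 0) and (1, 0) are mutual nearest neighbors here.
score = best_buddies_similarity([(0, 0), (10, 0)], [(1, 0), (2, 0)])
```

Because a distant outlier is rarely anyone's mutual nearest neighbor, it contributes nothing to the count, which is one intuition for why such a measure tolerates clutter.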

• ### Deep Learning Markov Random Field for Semantic Segmentation

Publication Year: 2017, Page(s): 1
Semantic segmentation tasks can be well modeled by Markov Random Field (MRF). This paper addresses semantic segmentation by incorporating high-order relations and mixture of label contexts into MRF. Unlike previous works that optimized MRFs using iterative algorithms, we solve MRF by proposing a Convolutional Neural Network (CNN), namely Deep Parsing Network (DPN), which enables deterministic end-t...

• ### Simultaneous Local Binary Feature Learning and Encoding for Homogeneous and Heterogeneous Face Recognition

Publication Year: 2017, Page(s): 1
In this paper, we propose a simultaneous local binary feature learning and encoding (SLBFLE) approach for both homogeneous and heterogeneous face recognition. Unlike existing hand-crafted face descriptors such as local binary pattern (LBP) and Gabor features which usually require strong prior knowledge, our SLBFLE is an unsupervised feature learning approach which automatically learns face represe...

• ### ELD-Net: An efficient deep learning architecture for accurate saliency detection

Publication Year: 2017, Page(s): 1
Recent advances in saliency detection have utilized deep learning to obtain high-level features to detect salient regions in scenes. In this paper, we propose ELD-Net, a unified deep learning framework for accurate and efficient saliency detection. We show that hand-crafted features can provide complementary information to enhance saliency detection that uses only high-level features. Our method u...

• ### Zero-Shot Learning on Semantic Class Prototype Graph

Publication Year: 2017, Page(s): 1
Zero-Shot Learning (ZSL) for visual recognition is typically achieved by exploiting a semantic embedding space. In such a space, both seen and unseen class labels as well as image features can be embedded so that the similarity among them can be measured directly. In this work, we consider that the key to effective ZSL is to compute an optimal distance metric in the semantic embedding space. Exist...

• ### Kronecker-Basis-Representation Based Tensor Sparsity and Its Applications to Tensor Recovery

Publication Year: 2017, Page(s): 1
It is well known that the sparsity/low-rank of a vector/matrix can be rationally measured by nonzero-entries-number ($l_0$ norm)/nonzero-singular-values-number (rank), respectively. However, data from real applications are often generated by the interaction of multiple factors, which obviously cannot be sufficiently represented by a vector/matrix, while a high order tensor is expected to provide ...
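
The two measures named in the first sentence can be written out explicitly. The notation is an assumption for illustration: $x$ is a vector, $X$ a matrix, and $\sigma(X)$ the vector of singular values of $X$.

$$
\|x\|_0 = \#\{\, i : x_i \neq 0 \,\}, \qquad \operatorname{rank}(X) = \#\{\, i : \sigma_i(X) \neq 0 \,\} = \|\sigma(X)\|_0
$$

For example, $x = (0, 3, 0, -1)$ has $\|x\|_0 = 2$, and the rank of a matrix is the $l_0$ measure applied to its singular values.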

• ### Visual Recognition in RGB Images and Videos by Learning from RGB-D Data

Publication Year: 2017, Page(s): 1
In this work, we propose a new framework for recognizing RGB images or videos by leveraging a set of labeled RGB-D data, in which the depth features can be additionally extracted from the depth images or videos. We formulate this task as a new unsupervised domain adaptation (UDA) problem, in which we aim to take advantage of the additional depth features in the source domain and also cope with the...

• ### Two-Stream Transformer Networks for Video-based Face Alignment

Publication Year: 2017, Page(s): 1
In this paper, we propose a two-stream transformer networks (TSTN) approach for video-based face alignment. Unlike conventional image-based face alignment approaches which cannot explicitly model the temporal dependency in videos and motivated by the fact that consistent movements of facial landmarks usually occur across consecutive frames, our TSTN aims to capture the complementary information of...

• ### Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

Publication Year: 2017, Page(s): 1
Online handwritten Chinese text recognition (OHCTR) is a challenging problem as it involves a large-scale character set, ambiguous segmentation, and variable-length input sequences. In this paper, we exploit the outstanding capability of path signature to translate online pen-tip trajectories into informative signature feature maps, successfully capturing the analytic and geometric properties of p...

• ### Robust Online Matrix Factorization for Dynamic Background Subtraction

Publication Year: 2017, Page(s): 1
We propose an effective online background subtraction method, which can be robustly applied to practical videos that have variations in both foreground and background. Different from previous methods which often model the foreground as Gaussian or Laplacian distributions, we model the foreground for each frame with a specific mixture of Gaussians (MoG) distribution, which is updated online frame b...

• ### Attribute And-Or Grammar for Joint Parsing of Human Pose, Parts and Attributes

Publication Year: 2017, Page(s): 1
This paper presents an attribute and-or grammar (A-AOG) model for jointly inferring human body pose and human attributes in a parse graph with attributes augmented to nodes in the hierarchical representation. In contrast to other popular methods in the current literature that train separate classifiers for poses and individual attributes, our method explicitly represents the decomposition and arti...

• ### Maximum Persistency via Iterative Relaxed Inference with Graphical Models

Publication Year: 2017, Page(s): 1
We consider the NP-hard problem of MAP-inference for undirected discrete graphical models. We propose a polynomial time and practically efficient algorithm for finding a part of its optimal solution. Specifically, our algorithm marks some labels of the considered graphical model either as (i) optimal, meaning that they belong to all optimal solutions of the inference problem; (ii) non-optimal if t...

• ### Collocation for Diffeomorphic Deformations in Medical Image Registration

Publication Year: 2017, Page(s): 1
Diffeomorphic deformation is a popular choice in medical image registration. A fundamental property of diffeomorphisms is invertibility, implying that once the relation between two points A to B is found, then the relation B to A is given per definition. Consistency is a measure of a numerical algorithm's ability to mimic this invertibility, and achieving consistency has proven to be a cha...

• ### Semantic Object Segmentation in Tagged Videos via Detection

Publication Year: 2017, Page(s): 1
Semantic object segmentation (SOS) is a challenging task in computer vision that aims to detect and segment all pixels of the objects within predefined semantic categories. In image-based SOS, many supervised models have been proposed and achieved impressive performances due to the rapid advances of well-annotated training images and machine learning theories. However, in video-based SOS it is oft...

• ### Learning and Inferring "Dark Matter" and Predicting Human Intents and Trajectories in Videos

Publication Year: 2017, Page(s): 1
This paper presents a method for localizing functional objects and predicting human intents and trajectories in surveillance videos of public spaces, under no supervision in training. People in public spaces are expected to intentionally take shortest paths (subject to obstacles) toward certain objects (e.g. vending machine, picnic table, dumpster etc.) where they can satisfy certain needs (e.g., ...

• ### Joint Multi-Leaf Segmentation, Alignment, and Tracking from Fluorescence Plant Videos

Publication Year: 2017, Page(s): 1
This paper proposes a novel framework for fluorescence plant video processing. The plant research community is interested in the leaf-level photosynthetic analysis within a plant. A prerequisite for such analysis is to segment all leaves, estimate their structures, and track them over time. We identify this as a joint multi-leaf segmentation, alignment, and tracking problem. First, leaf segmentati...

