• ### Safe Classification With Augmented Features

With the evolution of data collection methods, it is possible to produce abundant data described by multiple feature sets. Previous studies show that including more features does not necessarily bring positive effects. How to prevent the augmented features from worsening classification performance is crucial but rarely studied. In this paper, we study this challenging problem by proposing a safe c...

• ### On Detection, Data Association and Segmentation for Multi-target Tracking

In this work, we propose a tracker that differs from most existing multi-target trackers in two major ways. Firstly, our tracker does not rely on a pre-trained object detector to get the initial object hypotheses. Secondly, our tracker's final output is the fine contours of the targets rather than traditional bounding boxes. Therefore, our tracker simultaneously solves three main problems: ...

• ### Globally-Optimal Inlier Set Maximisation for Camera Pose and Correspondence Estimation

Estimating the 6-DoF pose of a camera from a single image relative to a 3D point-set is an important task for many computer vision applications. Perspective-n-point solvers are routinely used for camera pose estimation, but are contingent on the provision of good quality 2D-3D correspondences. However, finding cross-modality correspondences between 2D image points and a 3D point-set is non-trivial...

• ### Bearing-based Network Localizability: A Unifying View

This paper provides a unifying view and offers new insights on bearing-based network localizability, that is, the problem of establishing whether a set of directions between pairs of nodes uniquely determines (up to translation and scale) the position of the nodes in d-space. If nodes represent cameras, then we are in the context of global structure from motion. The contribution of the paper is theo...

• ### Binary Multi-View Clustering

Clustering is a long-standing and important research problem; however, it remains challenging when handling large-scale image data from diverse sources. In this paper, we present a novel Binary Multi-View Clustering (BMVC) framework, which can dexterously manipulate multi-view image data and easily scale to large data. To achieve this goal, we formulate BMVC by two key components: compact collaborative d...

• ### Distributed multi-agent Gaussian regression via finite-dimensional approximations

We consider the problem of distributedly estimating Gaussian processes in multi-agent frameworks. Each agent collects few measurements and aims to collaboratively reconstruct a common estimate based on all data. Agents are assumed to have limited computational and communication capabilities and to gather $M$ noisy measurements in total on input locations independently dra...

• ### Opening the Black Box: Hierarchical Sampling Optimization for Hand Pose Estimation

Hand pose estimation, formulated as an inverse problem, is typically optimized by an energy function over pose parameters using a ‘black box’ image generation procedure, knowing little about either the relationships between the parameters or the form of the energy function. In this paper, we show significant improvement upon the black box optimization by exploiting high-level knowled...

• ### Salient Subsequence Learning for Time Series Clustering

Time series has been a popular research topic over the past decade. Salient subsequences of time series that can benefit the learning task, e.g. classification or clustering, are called shapelets. Shapelet-based time series learning extracts these types of salient subsequences with highly informative features from a time series. Most existing methods for shapelet discovery must scan a large pool o...

• ### Fine-tuning CNN Image Retrieval with No Human Annotation

Image descriptors based on activations of Convolutional Neural Networks (CNNs) have become dominant in image retrieval due to their discriminative power, compactness of representation, and search efficiency. Training of CNNs, either from scratch or fine-tuning, requires a large amount of annotated data, where a high quality of annotation is often crucial. In this work, we propose to fine-tune CNNs...

• ### Differential Geometry in Edge Detection: accurate estimation of position, orientation and curvature

The vast majority of edge detection literature has aimed at improving edge recall and precision, with relatively few works addressing the accuracy of edge orientation estimates, which are often based on the gradient. We show that first-order estimates of orientation can have significant error and that this can be remedied by employing third-order estimates. This paper aims at estimating differential geometry attr...

• ### Salient Object Detection with Recurrent Fully Convolutional Networks

Deep networks have been shown to encode high-level features with semantic meaning and have delivered superior performance in salient object detection. In this paper, we take one step further by developing a new saliency detection method based on recurrent fully convolutional networks (RFCNs). Compared with existing deep network based methods, the proposed network is able to incorporate saliency prior ...

• ### Efficient Learning-Free Keyword Spotting

In this article, a method for segmentation-based learning-free Query by Example (QbE) keyword spotting on handwritten documents is proposed. The method consists of three steps, namely preprocessing, feature extraction and matching, which address critical variations of text images (e.g. skew, translation, different writing styles). During the feature extraction step, a sequence of descriptors is ge...

• ### lp-Box ADMM: A Versatile Framework for Integer Programming

This paper revisits the integer programming (IP) problem, which plays a fundamental role in many computer vision and machine learning applications. The literature abounds with many seminal works that address this problem, some focusing on continuous approaches (e.g., linear program relaxation), while others on discrete ones (e.g., min-cut). However, since many of these methods are designed to solv...

• ### Predicting the Driver's Focus of Attention: the DR(eye)VE Project

In this work we aim to predict the driver's focus of attention. The goal is to estimate what a person would pay attention to while driving, and which part of the scene around the vehicle is more critical for the task. To this end we propose a new computer vision model based on a multi-branch deep architecture that integrates three sources of information: raw video, motion and scene semantic...

• ### Light Field Reconstruction Using Convolutional Network on EPI and Extended Applications

In this paper, a novel convolutional neural network (CNN)-based framework is developed for light field reconstruction from a sparse set of views. We indicate that the reconstruction can be efficiently modeled as angular restoration on an epipolar plane image (EPI). The main problem in direct reconstruction on the EPI involves an information asymmetry between the spatial and angular dimensions, whe...

• ### Density-Preserving Hierarchical EM Algorithm: Simplifying Gaussian Mixture Models for Approximate Inference

We propose an algorithm for simplifying a finite mixture model into a reduced mixture model with fewer mixture components. The reduced model is obtained by maximizing a variational lower bound of the expected log-likelihood of a set of virtual samples. We develop three applications for our mixture simplification algorithm: recursive Bayesian filtering using Gaussian mixture model posteriors, KDE m...

• ### Few-Example Object Detection with Model Communication

In this paper, we study object detection using a large pool of unlabeled images and only a few labeled images per category, named "few-example object detection". The key challenge consists in generating as many trustworthy training samples as possible from the pool. Using few training examples as seeds, our method iterates between model training and high-confidence sample selection. In training, e...

• ### Mask R-CNN

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognit...

• ### Pólya Urn Latent Dirichlet Allocation: a doubly sparse massively parallel sampler

Latent Dirichlet Allocation (LDA) is a topic model widely used in natural language processing and machine learning. Most approaches to training the model rely on iterative algorithms, which makes it difficult to run LDA on big corpora that are best analyzed in parallel and distributed computational environments. Indeed, current approaches to parallel inference either don't converge to the c...

• ### Feedback Convolutional Neural Network for Visual Localization and Segmentation

Feedback is a fundamental mechanism existing in the human visual system, but has not been explored deeply in designing computer vision algorithms. In this paper, we claim that feedback plays a critical role in understanding convolutional neural networks (CNNs), e.g., how a neuron in a CNN describes an object's pattern, and how a collection of neurons form comprehensive perception to an object...

• ### Unifying Visual Attribute Learning with Object Recognition in a Multiplicative Framework

Attributes are mid-level semantic properties of objects. Recent research has shown that visual attributes can benefit many typical learning problems in the computer vision community. However, attribute learning is still a challenging problem as the attributes may not always be predictable directly from input images and the variation of visual attributes is sometimes large across categories. In this pa...

• ### Error Backprojection Algorithms for Non-Line-of-Sight Imaging

Recent advances in computer vision and inverse light transport theory have resulted in several non-line-of-sight imaging techniques. These techniques use photon time-of-flight information encoded in light after multiple, diffuse reflections to reconstruct a three-dimensional scene. In this paper, we propose and describe two iterative backprojection algorithms, the additive error backprojection (AE...

• ### DeepIGeoS: A Deep Interactive Geodesic Framework for Medical Image Segmentation

Accurate medical image segmentation is essential for diagnosis, surgical planning and many other applications. Convolutional Neural Networks (CNNs) have become the state-of-the-art automatic segmentation methods. However, fully automatic results may still need to be refined to become accurate and robust enough for clinical use. We propose a deep learning-based interactive segmentation method to im...

• ### Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition

Heterogeneous face recognition (HFR) aims at matching facial images acquired from different sensing modalities with mission-critical applications in forensics, security and commercial sectors. However, HFR presents more challenging issues than traditional face recognition because of the large intra-class variation among heterogeneous face images and the limited availability of training samples of ...

• ### HeadFusion: 360° Head Pose tracking combining 3D Morphable Model and 3D Reconstruction

Head pose estimation is a fundamental task for face and social related research. Although 3D morphable model (3DMM) based methods relying on depth information usually achieve accurate results, they usually require frontal or mid-profile poses, which precludes a large set of applications where such conditions cannot be guaranteed, like monitoring natural interactions from fixed sensors placed in the ...

