1 Introduction
Face recognition (FR) has a wide variety of applications, including surveillance, access control, health care, and advertising. Owing to its significance, it is a widely studied topic in the computer vision literature. Recently, deep convolutional neural network (DCNN) based solutions [1], [2], [3], [4], [5], [6], [7] have achieved remarkable success in FR applications, and these methods have replaced classical FR techniques almost entirely. Generally, state-of-the-art CNN-based systems rely on the following procedure: in the training phase, a deep CNN is trained on a large-scale dataset such as CasiaWeb [8] and/or MS-Celeb1M [9]. Preprocessing steps such as face detection and alignment are carried out before training, and a suitable loss function, such as the triplet loss [6], normalised Softmax [4], or ArcFace [2], is used to train the network. Once training is complete, the loss layer is discarded and the output of the CNN (an n-dimensional vector, with n usually equal to 512 or 2048) is treated as the feature vector corresponding to a given input face image. In the testing phase, a pair of inputs is fed to the trained network and the cosine similarity of the resulting feature vectors is evaluated. If the score is greater than a given threshold, the image pair is recognised as belonging to the same identity.
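To make the verification step concrete, the sketch below illustrates the cosine-similarity comparison described above. It is a minimal illustration, not a reference implementation: the 512-dimensional embeddings are assumed to have already been extracted by a trained network, and the threshold value is purely illustrative (in practice it is tuned on a validation set for a target false-accept rate).

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))


def same_identity(feat1: np.ndarray, feat2: np.ndarray,
                  threshold: float = 0.35) -> bool:
    """Declare a match if the similarity exceeds the threshold.

    The default threshold is an arbitrary illustrative value, not one
    taken from any particular FR system.
    """
    return cosine_similarity(feat1, feat2) > threshold


if __name__ == "__main__":
    # Random vectors stand in for embeddings produced by a trained CNN
    # (e.g. 512-dimensional ArcFace-style features).
    rng = np.random.default_rng(0)
    feat1 = rng.standard_normal(512)
    feat2 = rng.standard_normal(512)
    print(cosine_similarity(feat1, feat2), same_identity(feat1, feat2))
```

Because the embeddings are typically L2-normalised, thresholding the cosine similarity is equivalent to thresholding the Euclidean distance between the normalised feature vectors.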