Point Pattern Feature-Based Anomaly Detection for Manufacturing Defects, in the Random Finite Set Framework

Defect detection in the manufacturing industry is of utmost importance for product quality inspection. Recently, optical defect detection has been investigated as an anomaly detection problem using different deep learning methods. Most current works employ feature extraction methods that describe the entire image using a single feature vector, called the global feature. However, global features are sensitive to changes in factors such as lighting and viewpoint. An alternative is to use point pattern features, also known as local features or keypoints, which are robust to such changes. The use of robust point pattern features, such as SIFT, for defect detection with the recently developed set-based methods has not yet been explored. This paper proposes the use of point pattern features within a random finite set framework for defect detection. We also evaluate different point pattern feature detectors and descriptors, both handcrafted (e.g., SIFT) and pre-trained deep features, for defect detection applications. Experiments are carried out on a large-scale defect detection dataset (MVTec-AD), and the results are compared with state-of-the-art global feature-based anomaly detection methods. The results show that using point pattern features as data points within random finite set-based anomaly detection achieves the most consistent defect detection accuracy on the MVTec-AD dataset. This evaluation also shows that transfer learning of deep features gives promising results for defect detection.


I. INTRODUCTION
Automated visual inspection is a key part of the quality control process in many manufacturing applications and overcomes the limitations of human visual inspection. Manual visual inspection errors commonly include missed or incorrectly identified defects. Such errors can have a significant impact on product quality and may lead to unnecessary increases in production cost and overall waste. Therefore, automated computer vision-based visual inspection is key to productivity [1]. Different data-driven methods based on deep learning have been proposed for visual inspection in different areas, such as manufacturing [2]-[5], civil engineering [6]-[8], transportation [9], [10] and computing systems [11], [12].
Earlier, a number of automated solutions were developed using different image processing algorithms, such as the Haar filter for tile surface inspection [13], local order binary pattern for fabric defect detection [14], and scale-invariant keypoint features for PCB inspection [15]. Furthermore, combinations of image processing and machine learning algorithms have shown satisfactory performance. An example of this is railway inspection using the histogram of oriented gradients (HOG) with a support vector machine (SVM) [16]. Fabric inspection using adaptive boosting with HOG features and SVM was also proposed in [17]. However, a major drawback of image processing techniques is their reliance on engineered features, which can be very challenging to design for complex cases.
Recently, the success of deep learning methods in computer vision [18], [19] using convolutional neural networks has paved the way for applying deep learning to visual inspection. These methods use representation learning to perform different tasks, where the goal is to transform complex data into abstract representations known as features. Wang et al. [4] highlighted the power of deep learning for smart manufacturing and how it changes future industry trends. Different deep learning tools have been proposed to deal with different surface defects [20], [21]. Hui et al. [22] proposed the LEDNet network for defect detection and classification on LED chips. Cha et al. [6] proposed structural damage detection using Faster R-CNN to detect five different defects.
The use of deep learning methods, such as convolutional neural networks (CNNs), is limited by the availability of training samples, which gives rise to two problems: class imbalance between normal and defective samples, and the difficulty of data annotation. Collecting a large number of unlabeled images (normal and with defects) is simple, but labeling these samples is expensive and requires a trained inspector. These two problems are well known and are the subject of continuing research [23]-[25]. Therefore, treating defect detection as an anomaly detection problem is well suited to automated visual defect detection, given the lack of defective samples and their uncertainty. Recently, Bergmann et al. [26] proposed an evaluation of different deep anomaly detection methods for defect detection. They also presented a new dataset (MVTec) which includes a variety of different objects and textures with different types of defects. However, the aforementioned work does not explore the use of point pattern features for defect detection within the random finite set framework.
Vo et al. [27] proposed modeling point pattern features (e.g., SIFT) using point processes via multiple instance learning for anomaly detection. The proposed approach is based on treating each point pattern feature set as a sample of a random finite set (RFS). In addition, a likelihood function based on RFS density assumptions (e.g., Poisson IID clusters) was proposed. The approach allows both the number of extracted features (cardinality) and the feature density information to be incorporated into the anomaly detection solution. In this framework, the cardinality of features contributes to the RFS density.
In this paper, we provide an evaluation of different point pattern features for defect detection within the RFS framework. The main goal of this study is to explore the advantage of RFS-based anomaly detection for defect detection and to show how this approach can generate competitive results compared with current state-of-the-art deep learning approaches. In this evaluation, we have examined different state-of-the-art point pattern features that have been used for matching, as well as the well-known SIFT keypoint detector and descriptor.

II. RANDOM FINITE SET-BASED DEFECT DETECTION
The random finite set-based anomaly detection method models the point pattern features as an RFS. The rationale is that both the feature elements $x_i$ ($i = 1, \ldots, n$) in the set $X$ and the number of features, $|X| = n$, vary randomly. Accordingly, each measured set of point features $X = \{x_i\}_{i=1}^{|X|}$ is treated as an RFS. The RFS density with respect to some measure U is given by [28]:
$$p(X) = p_c(|X|)\, |X|!\, U^{|X|}\, p(x_1, \ldots, x_{|X|})$$
where $p_c(n) = P(|X| = n)$ is the cardinality distribution, $U$ is the unit hyper-volume, and $p(x_1, \ldots, x_n)$ is a symmetric joint feature density for the given cardinality $|X| = n$ [29].
In practice, different assumptions about the mathematical form of the RFS density $p(X)$ can be considered. Examples include the Poisson RFS [30], the Beta RFS [31], the Bernoulli RFS [28], the multi-Bernoulli RFS [32], the IID-cluster RFS [33], the labeled multi-Bernoulli RFS [34] and finally the generalized labeled multi-Bernoulli RFS [35]. The aforementioned densities have been used for multi-object tracking, in which the multi-object entity is treated as a random finite set. An independent and identically distributed (IID) cluster RFS density is given as follows:
$$p(X) = p_c(|X|)\, |X|!\, U^{|X|}\, [p(\cdot)]^X$$
where $p(\cdot)$ is the feature density and $[p(\cdot)]^X = \prod_{x \in X} p(x)$ is the finite set exponential. With the IID-cluster RFS, we need to make two different assumptions about the mathematical form of $p_c$ and $p(\cdot)$. If we assume that the cardinality distribution follows a Poisson distribution, then the IID-cluster RFS turns into the Poisson RFS given by:
$$p(X) = \rho^{|X|}\, e^{-\rho}\, U^{|X|}\, [p(\cdot)]^X$$
where $\rho$ is the non-negative Poisson intensity.

III. EVALUATION METHODOLOGY
In this section, we discuss the use of random finite set-based anomaly detection for defect detection. The general proposed approach of RFS-based defect detection is shown in Figure 1. The proposed approach consists of two steps. The first step is point pattern feature extraction and description, and the second step is feature modeling and defect detection using the RFS framework.

A. Point pattern feature extractor and descriptor
The term point pattern feature refers to any feature extraction pipeline that returns a set of keypoints (interest points) rather than a vector-based feature. These keypoints are 2D locations in an image which should be stable and repeatable under different lighting conditions and viewpoints. Point pattern features have been used in different computer vision tasks, such as Simultaneous Localization and Mapping (SLAM), camera calibration, Structure-from-Motion (SfM) and image matching [36]. Generally, most local feature detection methods return features in the form of a set, as in SIFT [15], while global feature detection methods commonly return features in vector format, as in the Histogram of Oriented Gradients (HOG) [37]. In machine learning, the traditional approach is to convert the point pattern features into vector format using methods such as bag of visual words [38].
The most well-known handcrafted point pattern feature detection method that has been used in this evaluation is the Harris-Laplace point detector, which uses the Harris corner detector to detect scale-invariant keypoints. Then, a descriptor such as SIFT [15] is calculated around these keypoints, where the size of the area depends on the maximum scale of the Laplacian-of-Gaussians [39]. The SIFT descriptor was proposed by Lowe [15] to describe the local shape around keypoints using the edge orientation histogram. The SIFT descriptor has shown good performance in different local feature localization and matching applications. Different color variants of SIFT have been proposed [40], such as Hue-SIFT, color-SIFT and opponent-SIFT, to address illumination variation issues. Evaluations of different handcrafted keypoint detectors and descriptors are performed in [41], [42].
Deep learning using convolutional neural networks has been shown to be superior to handcrafted feature representations in different computer vision tasks that require images as input, such as human pose estimation [43] and object detection [44]. As a result, convolutional neural networks have been used for keypoint detection and description in many applications [45]-[48].
In this evaluation, we use different handcrafted and deep-learned point pattern feature sets with the random finite set framework for feature learning and defect detection.

B. Defect detection using RFS
Due to the lack of access to defective samples, unsupervised anomaly detection is a preferred option for defect detection [26]. In this approach, only the normal (defect-free) samples are used in the training phase. Similarly, RFS-based defect detection uses only the normal samples during training to maximize the RFS density, where the model parameters can be learned using either the maximum likelihood estimator (MLE) or expectation maximization (EM) [31].
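The training step described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the evaluation: it assumes a Poisson RFS whose feature density is a single diagonal Gaussian, for which the MLE has a closed form (the Poisson intensity is the mean set cardinality). The function names, the dictionary layout and the `log_unit` parameter (standing in for log U) are hypothetical.

```python
import math


def fit_poisson_rfs(train_sets):
    """Fit a Poisson RFS by MLE (assumed diagonal-Gaussian feature density).

    train_sets: list of feature sets, one per defect-free image; each set
    is a non-empty list of equal-length descriptor vectors.
    """
    # MLE of the Poisson intensity is the mean set cardinality.
    rho = sum(len(s) for s in train_sets) / len(train_sets)
    # Pool all descriptors to estimate per-dimension mean and variance.
    points = [x for s in train_sets for x in s]
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    var = [
        max(sum((p[d] - mean[d]) ** 2 for p in points) / len(points), 1e-6)
        for d in range(dim)
    ]
    return {"rho": rho, "mean": mean, "var": var}


def log_feature_density(x, model):
    """Log of the diagonal-Gaussian feature density p(x)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, model["mean"], model["var"])
    )


def poisson_rfs_log_likelihood(feature_set, model, log_unit=0.0):
    """log p(X) = |X| log(rho) - rho + |X| log(U) + sum of log p(x)."""
    n = len(feature_set)
    cardinality_term = n * math.log(model["rho"]) - model["rho"] + n * log_unit
    feature_term = sum(log_feature_density(x, model) for x in feature_set)
    return cardinality_term + feature_term


# Toy usage with made-up 2-D descriptors (not real keypoint output).
train_sets = [[[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]] for _ in range(5)]
model = fit_poisson_rfs(train_sets)
score = poisson_rfs_log_likelihood(train_sets[0], model)
```

A test image would be flagged as defective when its score falls below a threshold chosen on defect-free data. Note that raw log-likelihoods of sets with different cardinalities depend on the unit U, which is why it is exposed as a parameter in this sketch.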

IV. EXPERIMENTS

A. Dataset
MVTec AD dataset: MVTec AD [26] is a comprehensive and challenging real-world industrial image dataset developed for defect detection. The dataset has an extensive collection of texture and object images. It has 5354 high-resolution color images of a variety of objects and textures. In MVTec AD, there are 15 different categories (ten objects and five textures). Each category has only normal samples for training, and both normal and defect samples for testing. There are 70 different types of defects, such as scratches, dents, contamination, and various structural deformations. Figure 2 shows examples of these samples. The first row shows the defect-free samples, while the second row shows defect samples.

B. Feature extraction
In order to extract sparse local features from the defect-free samples, different point pattern feature extraction methods have been used in this evaluation. The focus of this paper is on the current state-of-the-art deep-learned features. Thus, the following deep learning point pattern feature detectors and descriptors have been used.
LF-net [46]: an unsupervised learned network that uses a detect-then-describe strategy in one end-to-end network. The network outputs a 255-D descriptor.
D2-net [47]: this network provides joint learning of detection and description. During inference, the network provides a multi-scale option, which we refer to as D2-net2.
r2d2 [48]: this network provides joint learning of detection and description. It learns both keypoint repeatability and confidence from the training set. The r2d2 network is trained in a self-supervised manner using synthetic and real images.
Finally, we have used the most well-known handcrafted feature detection and description methods, namely SIFT [15] and the Harris-Laplace point detector, because they have shown good performance for category recognition.

C. Experimental Results
The proposed RFS-based defect detection using different point pattern feature detection and description methods is evaluated on the MVTec AD dataset. For the sake of comparison, we have compared the proposed approach with the state-of-the-art deep anomaly detection methods given in [26], which are as follows:
AE (SSIM): deep auto-encoder using SSIM as the loss function.
AE (L2): deep auto-encoder using a pixel-wise L2 loss function.
AnoGAN: anomaly detection using generative adversarial networks.
CNN: convolutional neural network feature dictionary [49].
Texture inspection: Gaussian mixture model for texture inspection [50].
Variation model: a variational model [51] of GMM for non-texture images, used by providing prior alignment of the object contours.
The implementation details and setups of these methods can be found in [26]. As an evaluation metric, we have used the same approach as in [26] for reporting the defect detection accuracy. We have calculated the accuracy of correctly classifying images for normal and defect samples.
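As a small illustration of this metric, the sketch below computes the ratio of correctly classified normal images, the ratio of correctly classified defect images, and their mean, which is how the tables report results. The function name and the 0/1 label encoding are assumptions made for the example, and the labels shown are made up.

```python
def classification_ratios(labels, predictions):
    """Return (normal_ratio, defect_ratio, mean_ratio) for one category.

    labels, predictions: equal-length sequences with 0 for a normal image
    and 1 for a defect image (an encoding assumed for this sketch).
    Both classes must be present in labels.
    """
    normal = [p for y, p in zip(labels, predictions) if y == 0]
    defect = [p for y, p in zip(labels, predictions) if y == 1]
    # Fraction of normal images predicted as normal.
    normal_ratio = sum(1 for p in normal if p == 0) / len(normal)
    # Fraction of defect images predicted as defect.
    defect_ratio = sum(1 for p in defect if p == 1) / len(defect)
    return normal_ratio, defect_ratio, (normal_ratio + defect_ratio) / 2


# Made-up labels and predictions, for illustration only.
ratios = classification_ratios([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0])
```

Averaging the two ratios, rather than pooling all images, keeps the score from being dominated by whichever class has more test images.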
Table I shows the ratio of correctly classified (normal and defect) samples for each object and the mean of these ratios. We rank the methods by their mean for each object (a lower rank is better), and the final rank based on the average (Avg.) rank is shown in the last row. The results show that RFS-based defect detection using SIFT has the best performance (ranks first), followed by the auto-encoder-based methods (ranked second and third). RFS (LF-net) and RFS (r2d2) show better performance than the AnoGAN and CNN-dictionary methods. We can see that RFS-based defect detection shows promising performance compared with the state-of-the-art methods for object-based defect detection.
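The ranking procedure can be sketched as follows. This is an illustrative reading of the description above, not the authors' script: methods are ranked within each category by mean accuracy (rank 1 is best), and the ranks are then averaged over categories. The function names are hypothetical, ties are broken by input order, and the numbers in the usage lines are placeholders rather than results from the tables.

```python
def rank_methods(mean_accuracy):
    """Rank methods within one category; rank 1 is the highest mean accuracy."""
    ordered = sorted(mean_accuracy, key=mean_accuracy.get, reverse=True)
    return {method: position + 1 for position, method in enumerate(ordered)}


def average_ranks(per_category):
    """Average each method's rank over all categories (lower is better)."""
    totals = {}
    for scores in per_category.values():
        for method, rank in rank_methods(scores).items():
            totals[method] = totals.get(method, 0) + rank
    return {method: total / len(per_category) for method, total in totals.items()}


# Placeholder scores for two hypothetical categories and three methods.
toy_scores = {
    "category_1": {"method_a": 0.9, "method_b": 0.8, "method_c": 0.7},
    "category_2": {"method_a": 0.6, "method_b": 0.95, "method_c": 0.5},
}
avg = average_ranks(toy_scores)
```

The final rank reported in the last row of a table would then be obtained by sorting the methods by this average rank.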
Table II shows the performance results for the five texture categories. As in Table I, the final rank has also been recorded. The RFS (SIFT) method has the best performance (ranks first) and the CNN-dictionary method has the second-best performance (ranks second). In conclusion, we can see from Tables I and II that RFS (SIFT) has the best performance compared to the different deep learning methods. In addition, RFS (LF-net) and RFS (r2d2) show consistent performance on both texture and object images.

D. Discussion
Tables I and II demonstrate the effect of using point pattern features within RFS for defect detection. Most current deep learning methods take structured data as input and produce structured features during representation learning. Point pattern feature detection methods provide sparse features (sets of features), but these features are usually converted to structured data, for example using bag of visual words (BoVW) [52]. The random finite set framework provides an elegant way to model the density of a feature set that takes into account both cardinality and feature density. The main assumption here is that defective samples will have more or fewer feature points than normal samples. The main reason why RFS (SIFT) performs better than or similarly to the state-of-the-art methods is that the RFS density considers both cardinality and feature density information. The cardinality of these features follows a distribution for normal samples, and defect samples are expected to deviate from it by having more or fewer features. It can be observed that using deep point feature detection, such as LF-net, does not introduce any advantage compared to SIFT; the reason is that this model has not been trained on the MVTec-AD dataset. However, most of these networks require ground truth during training, and there is no ground-truth annotation for the MVTec-AD dataset. Therefore, these deep learning methods perform poorly for feature extraction but still generate comparable results when combined with RFS-based defect detection.
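To make the two contributions explicit, the Poisson RFS density can be written in log form. The derivation below is a sketch that assumes the Poisson RFS form $p(X) = \rho^{|X|} e^{-\rho} U^{|X|} \prod_{x \in X} p(x)$ with IID features; the grouping into a cardinality term and a feature term is for illustration only, and $N$ and $X_1, \ldots, X_N$ (the number of training sets and the training sets themselves) are notation introduced here.

```latex
\log p(X)
  = \underbrace{|X| \log \rho - \rho + |X| \log U}_{\text{depends only on the number of features}}
  + \underbrace{\sum_{x \in X} \log p(x)}_{\text{depends on the feature values}}

% Maximizing \sum_{i=1}^{N} \log p(X_i) over \rho:
\frac{\partial}{\partial \rho} \sum_{i=1}^{N} \log p(X_i)
  = \frac{1}{\rho} \sum_{i=1}^{N} |X_i| - N = 0
  \quad \Longrightarrow \quad
  \hat{\rho} = \frac{1}{N} \sum_{i=1}^{N} |X_i|
```

Under these assumptions, a test image whose keypoint count is far from $\hat{\rho}$ is penalized through the first term even when its individual descriptors look typical, which is the behavior the discussion above relies on.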

V. CONCLUSION
In this paper, an evaluation of different point pattern feature detection and description methods within the random finite set framework for defect detection has been presented. The random finite set framework has been used to model the cardinality and density of these features, building a model that best fits the normal samples by maximizing the log-likelihood. Different deep keypoint detectors and descriptors have been used in this paper, and the focus was on data-driven approaches (deep learning methods). The experimental results show that using keypoint feature extraction (especially SIFT) within the RFS framework for defect detection has promising performance.

Fig. 1: The general proposed approach for RFS-based defect detection. The highlighted part is the only part that can be trained.
Figure ?? shows the keypoints of different feature extraction methods.

TABLE I:
Performance results of Poisson RFS-based defect detection using different feature extraction methods on the MVTec-AD dataset. The ratio of correctly classified normal samples (top row) and the ratio of correctly classified abnormal samples (bottom row) are given. The best means are in bold.

TABLE II:
Performance results of Poisson RFS-based defect detection and other methods on the MVTec-AD dataset. The ratio of correctly classified normal samples (top row) and the ratio of correctly classified abnormal samples (bottom row) are given. The best means are in bold.