Automatic Vehicle License Plate Recognition Using Optimal K-Means With Convolutional Neural Network for Intelligent Transportation Systems

Due to recent developments in highway research and the increased utilization of vehicles, significant interest has been paid to the latest, effective, and precise Intelligent Transportation Systems (ITS). The process of identifying particular objects in an image plays a crucial part in the fields of computer vision and digital image processing. Vehicle License Plate Recognition (VLPR) is a challenging process because of variations in viewpoint, shape, color, and format, as well as non-uniform illumination conditions at the time of image acquisition. This paper presents an effective deep learning-based VLPR model using optimal K-means (OKM) clustering-based segmentation and Convolutional Neural Network (CNN) based recognition, called the OKM-CNN model. The proposed OKM-CNN model operates in three main stages, namely License Plate (LP) detection, segmentation using the OKM clustering technique, and license plate number recognition using the CNN model. During the first stage, LP localization and detection take place using the Improved Bernsen Algorithm (IBA) and Connected Component Analysis (CCA) models. Then, OKM clustering with the Krill Herd (KH) algorithm is executed to segment the LP image. Finally, the characters in the LP are recognized with the help of the CNN model. An extensive experimental investigation was conducted using three datasets, namely the Stanford Cars, FZU Cars, and HumAIn 2019 Challenge datasets. The attained simulation outcomes confirm the effective performance of the OKM-CNN model over other compared methods in a considerable way.


I. INTRODUCTION
The recent developments in intelligent transportation systems (ITS) and Graphical Processing Units (GPU) have led to major attention being bestowed upon Automatic Vehicle License Plate Recognition (VLPR) in several research domains. LPR is considered highly significant in various applications like unmanned parking fields, security management of unattended regions, as well as traffic safety administration [1]. Unfortunately, these operations are tedious in nature due to the distinct formats of plates and dynamic outdoor illumination constraints, namely background, brightness, vehicle speed, and the distance between the camera and vehicles at the time of image acquisition. Hence, many techniques can only be implemented with restricted rules like permanent illumination, low vehicle speed, allocated paths, and static backgrounds.
A common method for license plate recognition (LPR) comprises four blocks: acquisition of a vehicle image, license plate (LP) localization, segmentation, character classification and standardization, and character analysis. The localization procedure is considered the most complex part of the mechanism, since it has a direct impact on the accuracy and efficiency of the consecutive procedures. Therefore, it is highly critical to resolve the issues arising from illumination conditions and other tedious backgrounds. A number of developers presented massive approaches to locate the LP, like the edge prediction model, usage of line-sensitive filters for extracting plate regions, the window scheme, and the arithmetic morphology approach [2]. Though the predefined models are capable of processing the position of the LP, they are beset with formidable demerits like sensitivity to illumination, higher computation time, and lack of versatility to be applied on diverse platforms.
Character segmentation was attained in previous studies under the application of morphology, relaxation labeling, as well as linked components [3]. Additionally, a maximum count of character analysis methodologies has been reported in the literature [4], such as Bayes classification, Artificial Neural Networks (ANN), Fuzzy C-Means (FCM), Support Vector Machine (SVM), the Markov chain model, and the K-Nearest Neighbor (kNN) classifier. Even though these methods are able to perform LP segmentation and analysis, several models work only on single-line character segmentation, and only two kinds of character analysis were established, namely English letters and numerals. Highly tedious LP recognition techniques and other different types of character analysis have not been explained.
Several researchers have begun to focus on LPR, which works on LP localization, segmentation, and character recognition. Effective placement of an LP system is meticulous, while the extensive dissection of a single part requires one to perform the tasks in a combined manner. This paper presents an effective DL-based VLPR model using optimal K-means (OKM) clustering-based segmentation and CNN-based recognition, called the OKM-CNN model. The proposed OKM-CNN model has three main stages. During the first stage, LP localization and detection take place using the Improved Bernsen Algorithm (IBA) and Connected Component Analysis (CCA) models. Subsequently, OKM clustering with the Krill Herd (KH) algorithm is executed to segment the LP image. Finally, the characters in the LP are recognized with the help of the CNN model. An extensive experimental investigation was conducted using three datasets, namely the Stanford Cars, FZU Cars, and HumAIn 2019 Challenge datasets. In short, the paper contributions are summarized as follows.
• Perform LP localization and detection process using IBA and CCA models

The upcoming sections of the study are organized as follows. Section 2 elaborates the existing works related to LP detection, character segmentation, and recognition. Section 3 introduces the proposed OKM-CNN model. Section 4 details the experimentation part, and the conclusion is drawn in Section 5.

II. RELATED WORKS
In this section, a survey of the existing works is undertaken in a three-fold manner, namely LP detection, character segmentation, and recognition.

A. LP DETECTION TECHNIQUES
Lin et al. [5] devised a new technique to detect LPs that can primarily be applied to predict vehicles and find their LPs so as to minimize false positives in plate prediction. Deep learning models find applications in various domains such as the industrial internet of things, image classification, medical diagnosis, and so on. In this view, the character recognition accuracy for blurred and noisy images increased when CNN was applied. Ullah et al. [6] concentrated on predicting LPs based on mathematical morphological attributes. The newly presented model was capable of working on every English LP that differs in shape and structure. Omran et al. [7] projected an automated LP analyzing mechanism by applying Optical Character Recognition (OCR) along with template mapping and a correlation technique for plate detection.
Babu et al. [8] implied a set of four major steps for LP analysis. In the beginning, during preprocessing, the images gathered from cameras were modified for proper brightness, noise was eliminated, and the images were transformed to grayscale. Then, the edges in the image were used to extract the LP location. Furthermore, the characters in the LP were segmented. Consequently, a template matching technique was used to analyze every character in the LP image. Rana et al. [9] defined numerous detection approaches for LPs and compared their functions on the same metrics. Further, this technique was applied with signature analysis along with CCA and the Euclidean distance transformation. Hence, the models discussed above have been used to attain better accuracy, yet failed because of improper illumination as well as blurring.

B. CHARACTER SEGMENTATION
Fernandes et al. [3] presented a k-means clustering algorithm for the segmentation of LP characters; a connected components labeling analysis (CCLA) algorithm is then used to identify the connected pixel regions and group the suitable pixels into components for the extraction of every character in an effective way. Liang et al. [10] employed a novel wavelet Laplacian technique to segment the characters randomly from video text lines. It searches for zero-crossing points to explore spaces among words as well as characters. This model was able to attain only minimum performance when an image is filled with a noisy background. Also, few approaches were projected for character segmentation in LP images. Khare et al. [11] developed a novel sharpness-based model to segment the characters of LP images. The model employed the gradient vector and accurateness for the segmenting operation. Therefore, this approach is cited to be more responsive in improving point selection as well as blur existence.
Kim et al. [12] deployed an effective model for LP detection on different illumination platforms. This model utilized binarization and a superpixel paradigm to segment the characters in the LP. The model mainly aims at a particular condition; however, it does not hold for many conditions. Meanwhile, Dhar et al. [13] projected a system deployment for LP recognition under the application of edge detection and CNN. The model consumed character segmentation in the form of a pre-processing phase for LP analysis. In the case of character segmentation, the newly-developed technique uses edge prediction, morphological tasks, and regional features. Thus, it is more effective for images with elegant backgrounds, but such images showed no impact from the comprised complexities.
Ingole and Gundre [14] employed character feature-based vehicle LP prediction and recognition. Initially, the model performed the process of segmenting characters from LP regions. For character segmentation, the technique presented vertical as well as horizontal projection profile-centric features. The presented features could be ineffective for images with difficult backdrops. Radchenko et al. [15] applied a character segmentation technique based on CCA. The CCA performs quite well if the input image has been binarized without character shape distortion and overlap among the characters. Hence, for images composed of complex backgrounds, it is very difficult to deploy a binarization model which divides foreground and background data. Finally, it has been revealed that a great number of approaches have attempted to resolve the issues with low illumination effects. However, they do not address alternate complications like blur, touching characters, and difficult backgrounds. Also, no models exist with redevelopment for character segmentation from LP images.

C. CHARACTER RECOGNITION
Raghunandan et al. [16] projected a Riesz fractional-centric approach to enhance LP detection and recognition. This approach is used to address the factors that affect LP discovery and identification. According to the experimental outcome, it is pointed out that the improvement of LP images might enhance the recognition outcome, which is unsuitable for real-time domains. Al-Shemarry et al. [17] applied an ensemble of AdaBoost cascades with 3L-LBP classifiers for LP recognition, particularly for low-quality images. In this model, it has been identified that the texture attributes depend upon the LBP task, and it applies a classification model for LP analysis on images influenced by diverse factors. Therefore, the function of this technique is based on learning and the count of modelled instances. Additionally, the value has been restricted to text prediction; however, recognition is not delivered by the developed approach. Text detection is simple when compared with recognition, because the detection process does not require complete shapes of characters. In recent times, the robust ability as well as discriminating power of DL methods have been leveraged, and various DL approaches for LP recognition were developed.
Dong et al. [18] implied a CNN-oriented technique for automated LP recognition, exploring R-CNN for the purpose of LP analysis. Bulan et al. [19] established segmentation- and annotation-free LP recognition along with deep localization as well as error identification. In line with this, it has been found that the CNNs detect a collection of candidate regions, after which false positives are removed from the candidate regions according to robust CNNs. Yang et al. [20] introduced Chinese vehicle LP recognition with the application of kernel-based Extreme Learning Machines (ELM) with deep convolutional features. The study explored the integration of CNN and ELM applied to LP recognition. The features were identified from DL modules, which perform well in the presence of a massive number of predetermined samples. However, it becomes highly difficult to select predefined instances which show feasible differences for LP recognition, especially for images influenced by several adverse factors. Also, the DL method is constrained by shortcomings like parameter optimization for different databases and retention of the reliability of the DNN. It is clear from these conditions that there exists a gap between earlier models and recent requirements. This observation motivated a novel approach for LP recognition with no dependency on classification models and a large count of labelled samples, as present in the previous approaches.
Yousif et al. [2] presented a new LP recognition model based on an optimized neutrosophic set (NS) using a genetic algorithm (GA). Initially, edge detection and morphological operations are performed for LP localization. Besides, GA is applied to extract the important features to optimize the NS operations. It is depicted that the utilization of NS reduces the indeterminacy in the LP images. Furthermore,

III. THE PROPOSED OKM-CNN MODEL
The overall working principle of the OKM-CNN model is depicted in figure 1. In the earlier stage, the LP localization and detection process takes place utilizing the IBA and CCA models. After this, the characters in the LP are segmented using the OKM algorithm, in which K-means clustering is incorporated with the KH algorithm. At last, the CNN-based character recognition process takes place to recognize the characters present in the LP.

A. LP LOCALIZATION AND RECOGNITION PROCESS
The dissemination of diverse locations on an LP image varies according to the rule of the plate and the impact of the lighting platform. A binary model with a global threshold is not capable of producing a convincing outcome, so an adaptive local binary technique has been employed. In local binary techniques, an image is classified into m × n blocks, and every block is computed using a binary model. In this study, two local binary methodologies were applied, namely local Otsu and the enhanced BA, which is one of the conventional binary approaches that can be employed on all sub-blocks. The function of Otsu is contingent on illumination constraints that have drastic variation. In order to overcome the irregular illumination barrier, especially for shadow images, a new binary technique, the enhanced BA, was utilized in this research.

1) IBA-BASED LP LOCALIZATION
As the computed LP is attained under different illumination cases and difficult backgrounds, the shadows and irregular illumination could not be eliminated from the LP. Therefore, shadow or uneven illumination removal is an essential procedure in the presented model. The binary results state that the conventional binary models are not capable of eliminating the shadow, so the LP could not be predicted and segmented. To resolve this problem, a novel binary technique referred to as IBA was presented.
In BA, both target and background have to be divided using a histogram that shows a bi-modal pattern of images. Global thresholding binary techniques like Otsu and the average grayscale value attain optimal results in this case. However, real-world images are composed of noise and alternate causes, so the image histogram may not produce a bimodal pattern. At this point, conventional binary models are not capable of accomplishing the desired final outcome. Local threshold approaches, namely the Bernsen Algorithm (BA) and the Niblack technique, are typically employed in handling critical interference in an image. Generally, to attain the optimal result of a local binary model, BA is the best solution to resolve the issue of poor illumination.
Assume that f(u, v) is the gray value of point (u, v). The block is composed of the point (u, v) and its size is (2w + 1) × (2w + 1). The threshold T(u, v) of f(u, v) is determined as the mid-range of the block, T(u, v) = (1/2)[max f(i, j) + min f(i, j)], where the maximum and minimum are taken over the (2w + 1) × (2w + 1) window centered at (u, v). Then, the binary image can be obtained by setting b(u, v) = 255 if f(u, v) ≥ T(u, v), and b(u, v) = 0 otherwise. Elimination of noise as well as conservation of the characters is highly significant in this mechanism. Assume that f(u, v) is the gray value attained through a Gaussian filter, σ implies the scale of the Gaussian filter, and z and l signify the metrics of the window.
where α refers to the variable to modify the trade-off between BA and the Gaussian filter (α ∈ [0, 1]). When α equals 0, the projected model is plain BA. When α equals 1, the BA technique is deployed on the Gaussian-filtered image. Under the application of a proper α, the shadow can be avoided in an effective manner and, as a result, the characters can be identified profitably.
• Use the Wiener filtering technique to eliminate the noise. This model could accomplish a better binary outcome, and it is pointed out that the strokes of Kana characters are maintained with some pixels.
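As an illustrative sketch of the IBA-style binarization described above (not the authors' implementation), the following computes the Bernsen mid-range threshold per pixel and blends it with a threshold computed on a smoothed copy of the image, weighted by α. The function names are hypothetical, and a simple 3×3 box blur stands in for the Gaussian filter for brevity.

```python
import numpy as np

def bernsen_threshold(img, w=1):
    """Per-pixel Bernsen threshold: mid-range of the (2w+1) x (2w+1) window."""
    img = np.asarray(img, dtype=float)
    H, W = img.shape
    pad = np.pad(img, w, mode="edge")
    T = np.empty((H, W))
    for u in range(H):
        for v in range(W):
            win = pad[u:u + 2 * w + 1, v:v + 2 * w + 1]
            T[u, v] = 0.5 * (win.max() + win.min())
    return T

def iba_binarize(img, w=1, alpha=0.5):
    """Blend the plain Bernsen threshold with one computed on a smoothed
    image (3x3 box blur standing in for the Gaussian filter); alpha = 0
    reduces to plain BA, alpha = 1 applies BA to the smoothed image."""
    img = np.asarray(img, dtype=float)
    H, W = img.shape
    pad = np.pad(img, 1, mode="edge")
    smooth = sum(pad[i:i + H, j:j + W]
                 for i in range(3) for j in range(3)) / 9.0
    T = (1 - alpha) * bernsen_threshold(img, w) \
        + alpha * bernsen_threshold(smooth, w)
    return np.where(img >= T, 255, 0).astype(np.uint8)
```

Note that, as with any mid-range threshold, low-contrast background blocks need an extra contrast test in practice; this sketch omits it.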

2) LP DETECTION
After computing the LP using the binary technique, the system moves to LP prediction. The simulation outcome of the prediction remains the key to overall performance. CCA is a popular model in image processing which scans an image and labels the pixels into units on the basis of the connections between pixels that are linked with one another. After the groups are determined, every pixel is designated a value based on its component. According to the LP data, two kinds of LPs are projected as follows:
• Black characters on a white backdrop
• White characters on a black backdrop
Here, two detection models were used: initially, the recognition of a white frame using CCA, and secondly, the detection of black characters by applying CCA. When the LP type is unknown, the LP detection models tend to produce two kinds of patterns:
Procedure 1: LP location with a frame. Candidate frames are analyzed on the basis of prior knowledge of the LP. When procedure 1 is applied, few candidate areas can be forecasted.
Procedure 2: LP location with no frame. If the LP could not be predicted, then procedure 2 has to be applied. It is used to predict the plate by massive extraction. When an LP with white characters on a black backdrop is unpredicted, the reverse image is examined.
At the initial stage, to retain the linked components of the same size, a few parameters are applied according to the characters' prior data, like the pixel value of the linked component, a width higher than 10, a height of more than 20, and a height-to-width ratio lower than 2.5 and higher than 1.5. Similarly, the maintained linked units are of the same size. Then, to eliminate a few non-character connected components, alternate limitations are generated by the character position on the LP, namely the distance between two characters and their angle.
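The size and ratio constraints quoted above can be sketched as a simple filter over connected-component bounding boxes. This is a hedged illustration with hypothetical names; the thresholds come directly from the text, and boxes are assumed to be (x, y, width, height) tuples as returned by a typical CCA pass.

```python
def filter_character_boxes(boxes, min_w=10, min_h=20,
                           ratio_lo=1.5, ratio_hi=2.5):
    """Keep connected-component bounding boxes (x, y, w, h) whose width,
    height, and height-to-width ratio match the character priors:
    w > 10, h > 20, and 1.5 < h/w < 2.5."""
    keep = []
    for (x, y, w, h) in boxes:
        if w > min_w and h > min_h and ratio_lo < h / w < ratio_hi:
            keep.append((x, y, w, h))
    return keep
```

A follow-up pass would then apply the positional limitations (inter-character distance and angle) to the surviving boxes.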
Generally, procedure 1 requires less time in comparison with procedure 2. The application of procedure 1 or procedure 2 to the candidate frame might provide maximum frames simultaneously. In the case of candidate frames accomplished by procedure 1, CCA is applied to obtain massive numeric characters and to process the penetration time at the midpoint of the LP to attain a few candidate frames. The number of penetration times implies the number of changes from black pixels to white pixels along the midline. The residual frames are not capable of arriving at a final decision. The candidate frames can be forwarded through the given steps and discriminated against the true LP according to the analyzed results.
The absolute value is shown by δ(u, v), in which the measure of I(u, v) is used to calculate the indeterminacy of A_NS(u, v). The NS image entropy involves the addition of three sets, as shown in the following equations.
The three entropy subsets are depicted as (E_TR, E_I, E_F). The probabilities of elements present in the three MFs are demonstrated by (P_TR(p), P_I(p), P_F(p)). Additionally, the deviations of F and TR develop the element distribution and the entropy of I to create F as well as TR associated with I. The local mean for a grayscale image A is defined by [2] as the average over a b × b window, where 'b' is the size of the average filter. The α-mean for the neutrosophic image A_NS is A(α) = A(TR(α), I(α), F(α)), where TR(α), I(α) and F(α) are given below. The entropy of I is improved by obtaining a uniform dissemination of elements, where the α value is optimized under the application of the KH algorithm.
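The b × b local mean used in the α-mean operation can be sketched as follows. This is a minimal illustration (hypothetical function name, edge-padded borders assumed) of the local averaging step only, not of the full neutrosophic transform.

```python
import numpy as np

def local_mean(A, b=3):
    """b x b local mean filter over a grayscale image A (edge-padded),
    i.e. each output pixel is the average of its b x b neighborhood."""
    A = np.asarray(A, dtype=float)
    pad = b // 2
    P = np.pad(A, pad, mode="edge")
    H, W = A.shape
    out = np.empty((H, W))
    for u in range(H):
        for v in range(W):
            out[u, v] = P[u:u + b, v:v + b].mean()
    return out
```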

C. OPTIMIZATION USING KH ALGORITHM
The optimum value of α is computed by applying the KH algorithm. The KH algorithm is stimulated by the herding nature of krill, which depends upon the individual outcome of each krill. The swarm of krill hunts for food and communicates with the other swarm members. The position of a krill is determined by a collection of three motions:
• movement induced by other krill,
• foraging action, and
• physical diffusion.
KH is treated as a Lagrangian model, as given in Eq. (23): dX_p/dt = N_p + F_p + D_p, where N_p indicates the movement induced by other krill, F_p is the foraging movement, and D_p is the physical diffusion.
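The Lagrangian update above can be sketched as a single Euler step per krill. This is a simplified one-dimensional illustration with hypothetical names; the induced and foraging motions are passed in precomputed, and the diffusion term is a bounded random perturbation, matching the three-motion decomposition only in outline.

```python
import random

def kh_step(positions, induced, foraging, d_max=0.01, dt=1.0):
    """One Euler step of the krill-herd Lagrangian dX/dt = N + F + D:
    N (motion induced by other krill) and F (foraging motion) are given
    per krill; D (physical diffusion) is a random term bounded by d_max."""
    new_positions = []
    for x, n, f in zip(positions, induced, foraging):
        d = d_max * random.uniform(-1.0, 1.0)  # physical diffusion
        new_positions.append(x + dt * (n + f + d))
    return new_positions
```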

The optimization Fitness Function (FF) is the Jaccard coefficient (JAC), a statistical value that estimates the union '∪' as well as the intersection '∩' of two sets. The fitness JAC is expressed by JAC = |A_rf ∩ A_rq| / |A_rf ∪ A_rq|, where A_rf denotes the computerized segment area obtained by applying the presented OKM model, and A_rq refers to the ground-truth region. The OKM (LP) character segmentation method is used to attain the optimal α. In order to attain a higher JAC coefficient using KH, Eq. (25) is applied.
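The Jaccard fitness can be sketched directly from its set definition. In this hedged illustration the two regions are represented as sets of pixel coordinates; the function name is hypothetical.

```python
def jaccard(seg, gt):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| between the segmented
    region and the ground-truth region, given as pixel-coordinate sets."""
    seg, gt = set(seg), set(gt)
    union = seg | gt
    return len(seg & gt) / len(union) if union else 1.0
```

The KH loop would evaluate this fitness for each candidate α and keep the value that maximizes it.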

D. OKM BASED CLUSTERING PROCESS
K-means is defined as the clustering method that consolidates the objects into K groups. The arithmetical function establishes the k-means objective O = Σ_q Σ_{w_p ∈ C_q} ||w_p − Z_q||², where q implies the overall cluster count, Z_q denotes the center of the q-th cluster, and d_q represents the number of pixels of the q-th cluster. In the k-means algorithm, it is essential to minimize O under the given constraint, where the dataset W = {w_p, p = 1, 2, . . . , n}, w_p signifies a sample in d-dimensional space, and C = {C_1, C_2, . . . , C_q} refers to the partition W = ∪_{p=1}^{q} C_p. Once the optimization is completed, the α, (T and I) subsets take new values, and the consequence of indeterminacy is computed. Then, k-means clustering is applied to the optimal NS subset (TR).
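The k-means objective and one assign/update (Lloyd) iteration can be sketched as follows. This is an illustrative implementation of standard k-means with hypothetical function names, not the authors' OKM variant, which additionally optimizes α via KH.

```python
import numpy as np

def kmeans_objective(W, Z, labels):
    """Sum of squared distances of each sample to its assigned center:
    O = sum over clusters q, samples w_p in C_q, of ||w_p - Z_q||^2."""
    W, Z = np.asarray(W, float), np.asarray(Z, float)
    return float(sum(np.sum((w - Z[c]) ** 2) for w, c in zip(W, labels)))

def lloyd_step(W, Z):
    """One k-means iteration: assign samples to nearest center,
    then recompute each center as the mean of its cluster."""
    W, Z = np.asarray(W, float), np.asarray(Z, float)
    d = ((W[:, None, :] - Z[None, :, :]) ** 2).sum(-1)  # pairwise sq dist
    labels = d.argmin(1)
    newZ = np.array([W[labels == q].mean(0) if np.any(labels == q) else Z[q]
                     for q in range(len(Z))])
    return newZ, labels
```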

E. CNN BASED CHARACTER RECOGNITION
1) CONV LAYER
A CNN differs from a conventional NN in that not every pixel is linked to the subsequent layer with weights and biases. Instead, the whole image is partitioned into smaller regions, and the weights/biases are applied to them. These are called filters or kernels, which are convolved with each smaller region in the input image and produce feature maps as output. The filters are considered simple 'features' searched for in the input image in the conv layer. The parameters needed to perform the convolution are few, since the same filter traverses the whole image for an individual feature. The filter count, local region size, stride, and padding are the hyperparameters of the convolution layer. With respect to the size and genre of the input image, these hyperparameters are tuned to achieve effective performance.
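The interplay of filter size, stride, and padding fixes the spatial size of the output feature map via the standard formula, which can be sketched as a one-line helper (hypothetical name):

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of a conv (or pooling) layer over an n x n
    input with an f x f filter, padding p, and stride s:
    floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1
```

For example, a 5×5 filter with no padding and stride 1 maps a 32×32 input to a 28×28 feature map.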

2) POOLING LAYER
The pooling layer is utilized to reduce the spatial dimension of the image and the parameter count, thereby minimizing the processing cost. It carries out a predefined function over the input and therefore introduces no parameters. Various kinds of pooling layers exist, namely average pooling, stochastic pooling, and max pooling. Here, max pooling is utilized, where an n × n window is slid over the input with stride value s. For every location, the maximum value in the n × n region is taken, and consequently the input size gets reduced. It offers translational invariance so that a minor difference in location can still be recognized.
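The max-pooling operation above can be sketched as follows; a plain-loop illustration with a hypothetical function name, not a framework call.

```python
import numpy as np

def max_pool(x, n=2, s=2):
    """Slide an n x n window over a 2-D input with stride s and keep
    the maximum of each window (no padding)."""
    x = np.asarray(x, dtype=float)
    H, W = x.shape
    out_h, out_w = (H - n) // s + 1, (W - n) // s + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + n, j * s:j * s + n].max()
    return out
```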

3) FC LAYER
Here, the flattened output of the final pooling layer is provided as input to an FC layer. It acts as a classical NN, in which each neuron of the earlier layer is linked to the current layer. Therefore, the parameter count in this layer is higher than in the conv layer. It is linked to the output layer, commonly known as the classifier.

4) ACTIVATION FUNCTION
Various activation functions are employed across different architectural models of CNN. Nonlinear activation functions such as ReLU, LReLU, PReLU, and Swish are available. A nonlinear activation function helps to speed up the training process. In this paper, the ReLU function is found to be effective over the other ones.
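The ReLU and leaky-ReLU functions named above have simple elementwise definitions, sketched here for reference (hypothetical helper names):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x) elementwise."""
    return np.maximum(0.0, x)

def leaky_relu(x, a=0.01):
    """Leaky ReLU: x for x >= 0, a*x otherwise (small slope a)."""
    return np.where(x >= 0, x, a * x)
```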

IV. PERFORMANCE VALIDATION

A. IMPLEMENTATION DETAILS
The presented OKM-CNN method was executed on a PC with an Intel Core i5 8th generation processor and 16 GB RAM. The OKM-CNN approach is implemented in the Python language with TensorFlow, Pillow, OpenCV, and PyTesseract. Figure 3