Banknote Classification Based on Convolutional Neural Network in Quaternion Wavelet Domain

In this paper, we propose a new framework for banknote classification based on the quaternion wavelet transform (QWT) and a deep convolutional neural network. First, the QWT, which has inherent directional sensitivity and a multi-scale structure, is applied to describe the phase and magnitude of different banknote images. Then we design a deep convolutional neural network that is trained on the banknote images along with the magnitude and phase of the quaternion wavelet coefficients. We assign neural weights to the output probabilities of the deep convolutional neural networks and update these weights with the back-propagation algorithm. Finally, the trained networks, with a decision formed as a weighted sum over the image, magnitude, and phase networks, are used for banknote image classification. The performance of our algorithm is experimentally verified on a variety of banknote databases. Experimental results show that the proposed algorithm achieves superior performance compared with other state-of-the-art banknote classification algorithms. The proposed algorithm can also satisfy the real-time requirements of a banknote sorting system.


I. INTRODUCTION
A large number of financial and cash transactions are conducted between countries in currencies such as the US dollar and the Euro, which are widely used around the world. Automatic banking and counting machines are used in many applications. Hence, an accurate and robust banknote classification algorithm is important for these systems.
Traditional algorithms for banknote feature extraction and classification mainly rely on geometric features [1], grid features [2], free masks [3], and wavelet features [4]. In [5], a statistical model is used to simulate the degradation of banknote images in circulation, which can significantly improve banknote recognition performance. In [6], the quaternion wavelet is used to analyze the statistical characteristics of the banknote image, and a statistical model of the banknote image is established. In recent years, with further improvements in computer and GPU performance, the deep convolutional neural network (DCNN) [7], [8] has developed rapidly. It is characterized by sparse connections, weight sharing, multi-level feature extraction, and good robustness to displacement. The DCNN has broken through the technical bottlenecks of both traditional image feature extraction and object recognition and classification. It treats feature extraction and classification as a whole and can automatically extract the most effective features during training.
The associate editor coordinating the review of this manuscript and approving it for publication was Dezhong Peng.
In [9], AlexNet applied the rectified linear unit (ReLU), dropout, and local response normalization (LRN) to the CNN, which can obtain better results in banknote image feature extraction and classification. The VGGNet [10], with small 3 × 3 convolution kernels and 2 × 2 max pooling layers, was proposed to explore the relationship between the depth of a convolutional neural network and its performance, and it has better generalization. In [11], the inception net is proposed to control the amount of computation and the number of parameters. Improved generative adversarial networks are applied to image in-painting in [12]. In [13], the residual neural network (ResNet) used residual units to train networks as deep as 152 layers, which can not only extract deeper features of the image but also learn the image content better.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
On the other hand, the sparse representation model is an effective tool for image processing. Its success is due to the fact that image signals have sparse representations with respect to fixed bases. In recent years, new sparse representation models have been built on the quaternion wavelet transform [14]-[16], which is shift invariant and whose coefficients consist of one magnitude and three phases. Two phases encode local image shifts, and the third phase describes image texture information. The quaternion wavelet transform has been successfully applied to edge detection [17], image registration [18], and classification [19]. Quaternion wavelet based methods have since been extended to vector signal processing, including color image analysis [20], color image de-noising [21], color image registration [22], and recognition [23].
However, traditional banknote feature extraction algorithms cannot exploit the texture information of the banknote image. They are also time consuming and struggle to extract discriminative features. In contrast, data-driven convolutional neural networks [14], [24] can extract features from the data distribution through multiple processing layers without preprocessing, and they have strong generalization ability. To address the problems described above, the proposed joint framework based on the quaternion wavelet transform and a convolutional neural network exploits banknote image features effectively and obtains a higher classification rate. The structure diagram of the proposed algorithm is illustrated in Figure 1. The contributions of this paper are as follows. We apply the QWT to describe the texture structure of different banknotes, using the decomposed phase and magnitude to further exploit its inherent directional sensitivity. An improved deep convolutional neural network for banknote image recognition is proposed for the first time, and the magnitude and phase of the QWT decomposition coefficients are used in the training phase. Neural weights are assigned to the outputs of the proposed CNNs and updated with the back-propagation algorithm, and the classification results are obtained from the trained networks through a neural weighted sum of their decisions. The proposed banknote image recognition algorithm achieves superior performance compared with state-of-the-art algorithms on different denominations from four countries.
The rest of the paper is organized as follows. In Section II, we give a brief introduction to the quaternion wavelet transform. The specific implementation of the proposed algorithm is given in Section III. In Section IV, we present the experimental results of the proposed algorithm in comparison with some existing algorithms. Finally, the conclusion and future work are given in Section V.

II. RELATED WORK
In this section, we review the basic concepts of the quaternion wavelet transform. As shown in [25], the quaternion wavelet transform of a 2-D function $f(x, y)$ can be represented in terms of approximation and detail coefficients, where the superscript $q$ denotes the quaternion signal, $A^q_n$ denotes the approximation coefficient matrix, and $D^q_{j,m}$ are the detail coefficient matrices at decomposition level $j$ for the directions $m = H, V, D$, which are horizontal, vertical, and diagonal. Let $\psi^q_m(x, y)$ denote the wavelet functions and $\phi^q(x, y)$ denote the scaling function. According to Eq. (2) and Eq. (3), we can obtain each approximation function $A^q_j f(x, y)$ and the detail component functions. The QWT procedure applies the scaling and wavelet filters to the rows and columns of the signal, subsampling by a factor of two. The real part of the approximation at decomposition level $j$ is then taken as the input to the next level, and the process repeats through all decomposition levels. Inspired by [26], we apply a Fourier implementation for the design of high-order filters, where $H^q_i$ and $G^q_i$ are the quaternion high-pass filter and the quaternion low-pass filter, respectively, which are symmetric in the space domain. The Hilbert transform is approximated over $\omega \in [-\pi, \pi]$ with the dyadic dilation rule $\sigma_i = 2^{i-1}$. At each scale, the sub-bands are normalized by the energy of the equivalent filters.
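For orientation, the decomposition and basis functions described at the start of this section are commonly written in the form below. This is a sketch that assumes the standard dual-tree construction of the QWT and is not a reproduction of the numbered equations of [25]. The 1-D symbols $\psi_h, \phi_h$ (a real wavelet and scaling function) and $\psi_g, \phi_g$ (their approximate Hilbert transforms) are assumptions that the text does not define, and the assignment of the labels $H$ and $V$ varies between authors.

```latex
% Multiresolution decomposition: one approximation term plus detail terms
f(x,y) = A^q_n f(x,y) + \sum_{j=1}^{n}
         \left[ D^q_{j,H} f(x,y) + D^q_{j,V} f(x,y) + D^q_{j,D} f(x,y) \right]

% Quaternion scaling function (\mathbf{i}, \mathbf{j}, \mathbf{k}: quaternion units)
\phi^q(x,y) = \phi_h(x)\phi_h(y) + \mathbf{i}\,\phi_g(x)\phi_h(y)
            + \mathbf{j}\,\phi_h(x)\phi_g(y) + \mathbf{k}\,\phi_g(x)\phi_g(y)

% Quaternion wavelets for the three directions
\psi^q_H(x,y) = \phi_h(x)\psi_h(y) + \mathbf{i}\,\phi_g(x)\psi_h(y)
              + \mathbf{j}\,\phi_h(x)\psi_g(y) + \mathbf{k}\,\phi_g(x)\psi_g(y)
\psi^q_V(x,y) = \psi_h(x)\phi_h(y) + \mathbf{i}\,\psi_g(x)\phi_h(y)
              + \mathbf{j}\,\psi_h(x)\phi_g(y) + \mathbf{k}\,\psi_g(x)\phi_g(y)
\psi^q_D(x,y) = \psi_h(x)\psi_h(y) + \mathbf{i}\,\psi_g(x)\psi_h(y)
              + \mathbf{j}\,\psi_h(x)\psi_g(y) + \mathbf{k}\,\psi_g(x)\psi_g(y)
```

In this form each coefficient is a quaternion, which is what allows it to be written as one magnitude and three phases.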
The quaternion wavelet can describe the 2-D analytic signal and is an extension of the wavelet transform and the complex wavelet transform. The analytic signal at different scales is obtained by embedding the Hilbert transform directly into the quaternion wavelet transform framework. Hence, the quaternion wavelet supports multi-resolution analysis of image signals. Compared with the complex wavelet, the quaternion wavelet filter has no direct current response and effectively avoids the forward bias. The complex wavelet is translation invariant. Since the quaternion wavelet can be viewed as a tensor product of complex wavelets, it is translation invariant as well.

III. OUR METHOD
The magnitude and phase information of the banknote image is extracted with the quaternion wavelet transform and fed to the proposed neural network. In this way, the proposed algorithm can learn features that describe the texture information of the banknote image more effectively, and it extracts and learns the most representative features from the banknote images. The specific structure of the core convolutional neural network is illustrated in Figure 2. The proposed convolutional neural network has nine layers, comprising convolutional layers and a global average pooling layer. The specific design details are as follows.
(a) Convolution layer. There are eight convolutional layers, which are divided into two categories. The first category contains six convolutional layers with a kernel size of 3 × 3, a stride of 1 × 1, and padding = 'SAME'. The second category contains two convolutional layers with a kernel size of 2 × 2, a stride of 2 × 2, and padding = 'VALID'. These layers extract features at different levels from the input data, and the extracted features are then mapped through nonlinear functions. The rectified linear unit is used as the activation function of each layer in the whole network. In addition, batch normalization is inserted between each convolution operation and its activation function to normalize the convolved data, which improves training efficiency and reduces overfitting.
(b) Global average pooling layer. The global average pooling layer is placed after the last convolutional layer to average its output features. The pooling kernel size is 32 × 32, and the final output vector has size 1 × 1 × 512, which greatly reduces the number of parameters and the computational complexity. It also improves the training efficiency and performance of the network.
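The layer settings in (a) and (b) can be checked with a little shape arithmetic. The sketch below is illustrative only: the input size of 128 × 128 and the order in which the two kinds of convolution are interleaved are assumptions (the text gives only the layer counts, kernel sizes, strides, and padding modes), chosen so that the last feature map is 32 × 32, matching the stated pooling kernel.

```python
import math

def conv_out(size, kernel, stride, padding):
    """Output length along one spatial axis for 'SAME' or 'VALID' padding."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")

# Hypothetical input size and layer order (assumptions, not from the paper).
INPUT_SIZE = 128
LAYERS = (
    [(3, 1, "SAME")] * 3 + [(2, 2, "VALID")]
    + [(3, 1, "SAME")] * 3 + [(2, 2, "VALID")]
)

def trace_sizes(size, layers):
    """Return the spatial size before the first layer and after every layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes

def global_average_pool(feature_maps):
    """Reduce each 2-D feature map (a list of rows) to its mean value."""
    return [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]

sizes = trace_sizes(INPUT_SIZE, LAYERS)
print(sizes)  # spatial size after each of the eight convolutions
```

Under these assumptions, the six stride-1 'SAME' layers keep the spatial size and each stride-2 'VALID' layer halves it, so 512 feature maps of size 32 × 32 are averaged into a 1 × 1 × 512 vector.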
The original banknote image and the magnitude and phase obtained from it are used as inputs to the proposed convolutional neural networks. The networks extract and learn features of the original image, the magnitude, and the phase at different scales and directions. We then assign weights to the feature sets extracted by each parallel convolutional network, learn those weights through back-propagation, and calculate the corresponding loss values through the soft-max and cross-entropy functions. Finally, the overall classification accuracy is calculated using a weighted sum of the decisions from the CNNs. The whole process is shown in Figure 3. The forward and back-propagation algorithms are used to train our deep neural networks. In the training process, the network performs two operations per iteration. The first is forward propagation, which calculates the network prediction for the given inputs. The second is back propagation, which minimizes the target classification error by updating the weights and biases of the network. In forward propagation, the network prediction is calculated by combining the inputs, weights, and biases. Hence, we calculate the total net input to each hidden layer neuron and pass it through the logistic function to calculate the output as follows, where $I_{c2}$, $O_{c2}$, $I_{c3}$, $O_{c3}$ are computed in the same way. Then, we calculate the net input and output of the neurons in the output layer of our neural network. The total output error is computed by summing the squared errors, where $y_t$ denotes the expected output, $y_o$ is the actual output, and $E_{o1}$ and $E_{o2}$ are the output errors.
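The forward pass and the error described above can be illustrated for a single neuron. This is a minimal sketch, not the paper's equations: the function names are hypothetical, and the factor of 1/2 in the squared error is a common convention assumed here because it simplifies the derivative.

```python
import math

def logistic(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Net input (weighted sum plus bias) passed through the logistic function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return logistic(net)

def total_squared_error(targets, outputs):
    """Sum of per-output squared errors; the 1/2 factor is an assumption."""
    return sum(0.5 * (t - o) ** 2 for t, o in zip(targets, outputs))

# Toy values, chosen only for illustration.
hidden = neuron_output([0.05, 0.10], [0.15, 0.20], 0.35)
error = total_squared_error([1.0, 0.0], [0.5, 0.5])
print(hidden, error)
```

Here `total_squared_error` plays the role of the total output error, with `targets` standing for $y_t$ and `outputs` for $y_o$.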
In back propagation, we calculate the partial derivative of the total error with respect to each weight according to the chain rule and update the weight as follows, where $\lambda$ is the learning rate. The other output layer weights $w_{11}$, $w_{12}$, $w_{13}$, $w_{14}$, $w_{15}$ are found in the same way. Then the hidden layer weight is computed as follows, where $w_2$, $w_3$, $w_4$, $w_5$, $w_6$, $w_7$, $w_8$, and $w_9$ can be obtained in a similar way. The minimum value of the objective function is then obtained by optimizing the weight values. The weight decision algorithm assigns decision weights to the different feature sets and then updates these weights through the back-propagation algorithm. Finally, the classification accuracy of the model is calculated by the weighted fusion of the convolutional neural networks. For the soft-max classifier, the original neural network output is denoted as $y_i$ ($i = 1, 2, \cdots, n$). The output after soft-max regression processing is defined as follows,
$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}} \tag{15}$$
where the soft-max function with probability weight vector $w$ is defined as follows. The corresponding probability weight loss function is given as,
$$H = -\sum_{n} y_n \log w - \sum_{n} y_n \log y_i \tag{17}$$
where the probability weights $w$ are updated by back-propagation using the following rules. Then, three neural networks are trained simultaneously on the decomposed components of the banknote image. Their weights are updated as follows, where $w_o$ corresponds to the banknote image, and $w_M$ and $w_P$ correspond to the magnitude and phase of the banknote image obtained with the QWT. Hence, the corresponding normalized weights are defined as follows, where $w_{no}$, $w_{nM}$, $w_{nP}$ are the normalizations of $w_o$, $w_M$, $w_P$, which satisfy the relationship $w_o + w_M + w_P \approx 1$.
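The weighted decision fusion described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the normalization is assumed to divide each weight by the sum of the three, and the logits and weights are invented values.

```python
import math

def softmax(logits):
    """Soft-max over raw outputs y_i; subtracting the max is for numerical stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalize_weights(w_o, w_m, w_p):
    """Assumed normalization: divide each decision weight by their sum."""
    total = w_o + w_m + w_p
    return w_o / total, w_m / total, w_p / total

def fuse(prob_o, prob_m, prob_p, weights):
    """Weighted sum of the class probabilities from the three networks."""
    w_no, w_nm, w_np = normalize_weights(*weights)
    return [w_no * a + w_nm * b + w_np * c
            for a, b, c in zip(prob_o, prob_m, prob_p)]

def predict(fused):
    """Index of the class with the largest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)

# Invented logits for a 3-class toy problem (image, magnitude, phase networks).
p_image = softmax([2.0, 0.5, 0.1])
p_magnitude = softmax([0.2, 1.0, 0.3])
p_phase = softmax([1.5, 0.2, 0.1])
fused = fuse(p_image, p_magnitude, p_phase, (0.5, 0.3, 0.2))
print(predict(fused))
```

Because the normalized weights sum to one and each network outputs a probability vector, the fused vector is itself a probability distribution over the classes.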

IV. EXPERIMENTAL RESULTS AND ANALYSIS
A. DATASETS
The proposed banknote classification algorithm was tested on banknote image databases from several currencies: the US dollar (USD), the Euro (EUR), the Russian Ruble (Ruble), and the Chinese Yuan (CNY). In each dataset, every banknote of every face value appears in four directions. The numbers of banknote images and classes are listed in Table 1. Figure 4 shows some examples of the banknote images of the different countries.

B. EXPERIMENTAL ENVIRONMENT SET
As the proposed algorithm processes a large amount of data through matrix and image processing operations, training was carried out on the GPU. Inspired by [28], the probability weight of each feature set was initialized to 1/3. The Adam optimization algorithm is used to optimize the proposed network with its default parameter settings: the learning rate is set to $\xi = 0.001$, and the exponential decay rates of the moment estimates, $\rho_1$ and $\rho_2$, are 0.9 and 0.999, respectively. To improve the efficiency of the network, the input data is fed in mini-batches of size 128. Training was terminated when the error of the model on the validation set fluctuated around a stable value for three consecutive epochs. Through experiments, we finally performed 4000, 3500, 3500, and 3800 iterations on the Ruble, USD, EUR, and CNY datasets, respectively.
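For readers unfamiliar with Adam, the update with the settings quoted above can be sketched for a single scalar parameter. This is a generic illustration of the optimizer, not the training code used in the experiments: `lr`, `beta1`, and `beta2` stand for $\xi$, $\rho_1$, and $\rho_2$, the stabilizing constant `eps = 1e-8` is an assumed value that the text does not state, and the quadratic objective is a toy example.

```python
import math

def adam_step(param, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(x) = x^2 with gradient 2x, for illustration only.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    x, m, v = adam_step(x, 2.0 * x, m, v, t)
print(x)
```

With bias correction, the first step moves the parameter by almost exactly the learning rate regardless of the gradient's scale, which is one reason the default settings transfer well across problems.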

C. PERFORMANCE EVALUATION
The first experiment shows how the probability weights change on the USD dataset as the number of training iterations increases. The experimental results in Table 2 show that the weight assigned to the original image accounts for a large proportion. Similar results were obtained on the other datasets, namely Ruble, EUR, and CNY. Hence, it can be concluded that the original image plays an important role in the training process, and that the corresponding magnitude and phase of the QWT also play an indispensable role in improving the classification accuracy of the model.
The second experiment illustrates how the classification rate of the proposed algorithm changes with the number of training iterations on the different datasets. The proposed algorithm was compared with seven existing algorithms: free Mask [3], DWT [4], VGGNet19 [10], PReLU-net [27], BN-inception [28], SAGP [29], and ResNet [13]. For the ResNet network, 128 layers and 4 residual blocks are built in this experiment. Figs. 5-8 show the average classification accuracy of the proposed algorithm and of the compared algorithms, including BN-inception [28], SAGP [29], and ResNet-28 [13].
The proposed banknote classification framework has two advantages. First, the QWT is applied to extract the texture and phase features of the banknote images. Second, the improved deep convolutional neural network provides fast and accurate classification of banknote images of different sizes.

D. PROCESSING TIME
In the last experiment, the processing time of the proposed algorithm was measured on the four banknote datasets with an Intel(R) Core(TM) i7-4790 CPU at 3.60 GHz and an Nvidia Titan X GPU. Table 6 shows the processing time of the proposed algorithm on each dataset: 0.98 ms, 2.90 ms, 1.51 ms, and 1.78 ms for Ruble, USD, EUR, and CNY, respectively. The experimental results show that the elapsed time of the proposed algorithm is significantly lower than that of VGGNet19 [10], PReLU-net [27], BN-inception [28], SAGP [29], ResNet28 [13], and DWT [4], and comparable to that of free Mask [3]. Hence, it is concluded that the proposed algorithm obtains the best balance between computational complexity and banknote image recognition rate.
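To relate these timings to a sorting machine's throughput, the short calculation below converts them to banknotes per second. It assumes that each reported time is the processing time for a single banknote image and that images are processed one after another; neither assumption is stated explicitly in the text.

```python
# Processing times reported in Table 6, in milliseconds.
TIMES_MS = {"Ruble": 0.98, "USD": 2.90, "EUR": 1.51, "CNY": 1.78}

def notes_per_second(time_ms):
    """Throughput implied by a per-note processing time (sequential processing assumed)."""
    return 1000.0 / time_ms

throughput = {name: notes_per_second(t) for name, t in TIMES_MS.items()}
for name, rate in throughput.items():
    print(f"{name}: about {rate:.0f} notes per second")
```

Under these assumptions, even the slowest case (USD, 2.90 ms) corresponds to several hundred banknotes per second.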

V. CONCLUSION
In this paper, we propose a joint framework based on the quaternion wavelet transform and a convolutional neural network for banknote image classification. We apply the QWT to describe the phase and magnitude information of different banknote images. We then train our deep convolutional neural networks on the banknote images along with the magnitude and phase of the quaternion wavelet coefficients. We assign neural weights to each feature set decision and update these weights via the back-propagation algorithm. Finally, we calculate the classification accuracy using a weighted sum of the decisions from the banknote images, magnitude, and phase. The experimental results show that the proposed algorithm effectively improves the accuracy of banknote image classification and has better generalization ability than traditional banknote image classification methods. The proposed algorithm can be extended to color banknote image classification.
XIANG HUANG was born in Jiangxi, China, in 1996. He received the bachelor's degree in engineering of IoT from Jiangxi Agricultural University, Nanchang, China, in 2018. He is currently pursuing the master's degree with Nanchang Hangkong University, Nanchang. His current research interests include machine learning, digital image processing, and quaternion.
SHAN GAI was born in Jilin, China, in 1980. He received the Ph.D. degree in pattern recognition and intelligence system from the Harbin Institute of Technology, Harbin, China, in 2011. He is currently an Associate Professor with Nanchang Hangkong University. His current research interests include pattern recognition, machine learning, medical image processing, and quaternion.