Tools for Fast Metric Data Search in Structural Methods for Image Classification

The article proposes a new classification method based on high-speed search tools for an indexed data structure built on the etalon set of features; the method has significant advantages in processing speed over traditional approaches. The classifier is a two-stage process: at the first stage, the class of each separate object descriptor is determined, and at the second stage, the resulting class of the object is determined from the obtained set of local solutions. The developed method is based on the preliminary construction of indexed hash structures for the set of descriptors of the base of etalon images. The hash representation increases the speed of identification or classification of visual objects. A comparative experiment has been conducted against the traditional voting method, in which a linear search for the nearest descriptor is performed without prior creation of the indexed hash representation of the etalons. In the experiment, the developed method was over 10 times faster than the traditional one. The gain in processing speed increases proportionally with the number of etalons and the number of descriptors in the descriptions. The experiment has shown that the efficiency of the method can be enhanced by varying the values of its parameters and adapting them to the properties of the data.


I. INTRODUCTION
Structural methods of image classification implemented for modern computer vision systems are based on data on images of visual objects as the set of keypoint (KP) descriptors [1], [2], [3], [4], [5]. Such methods are the most effective for recognizing images of a fixed structure [2], [3]. The KP descriptor is a vector of 64 to 512 binary components that approximates a fragment of the image brightness function [4], [6]. The traditional image classifier is based on the metric criteria of ''set-set'' relevance between the images of the recognized object and the etalon and on the optimization of the value of relevance for the etalon database [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Ikramullah Lali.
The class of the recognized visual object is defined as the infinite set of images of the object, taking into account its allowable geometric transformations: shifts, rotations, and scaling within the field of view of the recognition system [1], [2], [3], [4]. The class is represented by the etalon image, from which the computer system determines the description (set of features) of the etalon. The set of user-selected etalon images creates a finite database of samples on the basis of which objects are classified. Classification involves assigning the input image to one of the etalon classes or abandoning the classification [1], [2].
Element-by-element analysis of the set of description components is a more universal way to provide noise immunity: by forming statistical solutions for the holistic image of the object, it reduces the impact of interference [2]. For the method of component analysis, the traditional approach of structural classification is to determine, for each descriptor in the description of the recognized object, the subset of relevant descriptors in the complete base of etalons and to count the number of object descriptors (components) assigned to each class. This procedure implements the voting process. The degree of relevance is determined by calculating the metric value.
Taking into account that the number of descriptors in the description of a separate etalon reaches 300 to 500, and that there can be tens or even hundreds of classes, this approach leads to a complete search and requires quite cumbersome calculations [3], [4], [5].
The implementation of data hashing tools significantly reduces time costs in the practical realization of classification procedures based on metric search [5], [7], [8]. During hashing, the data of the etalon database are pre-decomposed into groups of similar elements according to some feature, which significantly (by thousands of times) speeds up processing at the cost of a corresponding increase in memory requirements [5], [9], [10], [11]. Hashing implements the promising idea of reducing the dimensionality of the analyzed data by pre-constructing an indexed structure for the etalon set, which simplifies processing and reduces the computational costs of classification.
In articles [12] and [13], approximate methods of metric search by sample are studied, based on creating indexed data structures on a set of hierarchical features of segmented images. An indexed structure based on clustering for bulky multidimensional data sets has been implemented for professional search systems [14], where a multi-stage clustering scheme with a model of approximate search of relevant elements within the cluster has been used to improve performance. The creation of cluster structures as a modern approach to data mining allows better adaptation to the content of etalon information, which has been used to determine the relevance of visual objects in the ''bag of words'' technology [2], [11], [15], [16]. Algorithmic and firmware tools for fast search in multidimensional data warehouses (e.g., in web applications) have been successfully developed and implemented in modern applied search systems [9], [10], [12], [15], [17], [18], [19], [20], [21], [22].
In article [9], the range of high-speed search tools is studied in detail. Among them we can highlight the principle of combining several methods in the form of a chain: hashing, data division into blocks (design), Locality-Sensitive Hashing processing (LSH-processing) as sorting by the value of the hash function for data blocks, and logical analysis (agreed filtering) of the values of the hash function for blocks. The key idea of LSH-processing is to focus only on data pairs that may be similar, so there is no need to check each of the pairs [9], [23].
Implementing any new tools of analysis or data processing requires paying attention to the quality indicators of the classifier [24], [25]. It is clear that, for computer vision systems, the key criterion for implementing fast search tools remains the guaranteed provision of sufficient classification performance indicators, such as the probability of correct recognition.
Given the urgent requirements to provide the high computational performance of structural methods of image classification while maintaining a sufficient level of their effectiveness, it is necessary to implement into their structure the modern approaches of high-speed search and data analysis used in technologies of processing volumetric arrays of multidimensional information.
The purpose of the work is to increase the computational performance of structural methods for image classification.
The tasks of the research are to create and apply the models of high-speed image classifier based on the construction of data hash structures for etalon descriptions of classes as a set of descriptors of keypoints, to implement data partitioning tools into non-intersecting blocks, to use the Locality-Sensitive Hashing (LSH) and logical processing apparatus. Based on the results of the software simulation, it is necessary to evaluate the effectiveness and the obtained performance indicators for the developed modifications of the classifiers in comparison with the known ones.

II. CLASSIFICATION BY THE SET OF COMPONENTS OF THE STRUCTURAL DESCRIPTION
Let's consider the space $B^n$ of multidimensional binary vectors of dimension $n$, in which we will construct the images of the recognized object and the etalons. Let's fix some multiset of vectors $E_i \subseteq B^n$ as the description of the etalon in the space of KP descriptors, where $s = \operatorname{card} E_i$ is the cardinality of the set of descriptors [1], [2]. The description components are the vectors $e_k \in B^n$, a finite set of which creates the object description. The statement of the classification task assumes the presence of some base $E$ of descriptions of the etalons of dimension $N$:

$$E = \{E_i\}_{i=1}^{N}.$$

Let's represent the recognizable object as the set $Z \subset B^n$, $Z = \{z_w\}_{w=1}^{s}$. Let's set the task of constructing the classifier $K$ as the mapping

$$K: Z \to [1, 2, \ldots, N] \tag{1}$$

from the set of description components into the set of class numbers, based on the preliminary construction of the indexed structure on the set $E$.
VOLUME 10, 2022
Let's present the classification $K$ by the set of the description components as a two-stage process: in the first step $K_1: B^n \to [1, 2, \ldots, N]$ we determine the class $d_w$ for each descriptor $z_w \in Z$ of the object, and in the second step $K_2: D \to [1, 2, \ldots, N]$, from the set $D = \{d_w\}_{w=1}^{s}$ of the solutions obtained in the first step, we form the aggregate conclusion about the class of the object $Z$.
Such principle of processing corresponds to the generalization of the set of solutions of the homogeneous partial classifiers according to the boosting model [1], [2], [11], [19].
Here the stage $K_1$ can be considered as the multivalued characteristic function for determining the etalon class $E_i$ for a single descriptor from the object description. The stage $K_2$ represents the final classification solution.
The stages $K_1$, $K_2$ in the classifier (1) can be constructed in other variants, e.g., by creating at the stage $K_1$ some probability distribution over the etalon classes, including procedures for logical analysis or processing of such distributions [1], [2].
Structurally and informatively, the implementation of $K$ is based on the a priori data of the existing database $E$, since the membership of all $e_v(i)$ in the corresponding image $E_i$ within the database is already known at the beginning of the classification.
If the classification $K_1$ is carried out by the traditional method of linear metric search (complete search), using sequential analysis of each of the elements of the set $E$, then the competition rule is applied [2], [11], [16]:

$$d_w = \arg\min_{i=1,\ldots,N} \; \min_{v=1,\ldots,s} \rho(z_w, e_v(i)), \tag{2}$$

where $d_w$ is the number of the etalon $E_i$ to which the descriptor $z_w$ of the object will be assigned, $d_w \in \{1, \ldots, N\}$, and $\rho(z_w, e_v(i))$ is the metric in the vector space. For effective classification, logical analysis of the value of the minimum achieved in (2) is important [11]. The value $d_w$ is determined only if the obtained minimum distance $\rho_{\min}$ in (2) does not exceed the specified threshold $\delta_\rho$: $\rho_{\min} \le \delta_\rho$. Otherwise, the class $d_w$ is not defined, and we consider the analyzed descriptor to be false.
The required number $Q$ of calculated metric values in (1) for the complete database of the etalons using linear search can be estimated as $Q = N \cdot s^2$, assuming the volumes of descriptions of the etalons and objects are equivalent.
For the vectors of the space $B^n$ in (2), the computationally simple Hamming metric $\chi$ can be applied, which counts the number of mismatched bits of the vectors [2]:

$$\chi(z_w, e_v(i)) = \sum_{j=1}^{n} \mathbf{1}[z_{wj} \ne e_{vj}(i)]. \tag{3}$$

In (3), the logical function $\mathbf{1}[\ldots]$ takes the value 1 if the $j$-th bits $z_{wj}$, $e_{vj}(i)$ of the vectors do not match, and takes the value 0 otherwise. The idea of the stage $K_2$ is that at each step, according to rule (2), the obtained value $d_w$ increments the accumulated number $r_i$ of votes of the elements assigned to the $i$-th class:

$$r_i \leftarrow r_i + 1 \quad \text{for } i = d_w. \tag{4}$$

The resulting class $i_0$ for the integral image $Z$ of the object is determined by the maximum number of votes among all classes:

$$i_0 = \arg\max_{i=1,\ldots,N} r_i. \tag{5}$$

The values $\{r_i\}_{i=1}^{N}$ reflect the class histogram by the number of votes of the elements from the set $Z$. The equalities (4) and (5) specify the stage $K_2$, which consists in processing the obtained votes for the description components $Z$. When implementing the stage $K_2$, logical analysis of the maximum number of votes $r_{i_0}$ in (5) is also acceptable. We make a positive decision about the object class under the condition $r_{i_0} \ge \delta_r$, where $\delta_r$ is the threshold for the decisive number of votes.
The thresholds $\delta_\rho$, $\delta_r$ are determined, as a rule, by the conditions of the classification [2].
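The two-stage voting scheme with linear search described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: descriptors are stored as Python integers, and the function names and the toy 8-bit data are assumptions made for the example.

```python
from collections import Counter

def hamming(a: int, b: int) -> int:
    """Number of mismatched bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")

def classify_descriptor(z, etalons, delta_rho):
    """Stage K1: class of the nearest etalon descriptor, or None if too far."""
    best_class, best_dist = None, None
    for class_id, descriptors in enumerate(etalons, start=1):
        for e in descriptors:
            dist = hamming(z, e)
            if best_dist is None or dist < best_dist:
                best_class, best_dist = class_id, dist
    if best_dist is None or best_dist > delta_rho:
        return None  # the descriptor is considered false
    return best_class

def classify_object(Z, etalons, delta_rho, delta_r):
    """Stage K2: count votes per class and apply the vote threshold."""
    votes = Counter()
    for z in Z:
        d_w = classify_descriptor(z, etalons, delta_rho)
        if d_w is not None:
            votes[d_w] += 1
    if not votes:
        return None, votes
    i0, r_max = votes.most_common(1)[0]
    return (i0 if r_max >= delta_r else None), votes

# Toy data: two etalon classes with two 8-bit descriptors each.
etalons = [[0b00000000, 0b00001111], [0b11111111, 0b11110000]]
Z = [0b00000001, 0b00001110, 0b11111110]
label, votes = classify_object(Z, etalons, delta_rho=2, delta_r=2)
```

Every object descriptor is compared with every etalon descriptor here, which is the complete search whose cost the indexed structure is meant to reduce.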
The proposed two-stage classification procedure is based on the fundamental principle of intellectual analysis, which consists in counting the number of positive decisions (support, attendance rating) on the analyzed data set [19], [26]. The considered variants of constructing the classifier can naturally be interpreted within the theory of ensemble models [2], [19], according to which the creation and aggregation of responses of the component classifiers (local decisions) synthesizes a ''strong'' classifier with guaranteed higher efficiency of decision-making under the boosting model [16], [19]. The key computational problem when constructing the classifier for multidimensional databases is the implementation of the search (2), which is a nearest-neighbor search in a large set of vectors. Let's focus on its solution by using, within the two-stage classification procedure, a specialized data structure that aims to reduce the amount of computation $Q$.

III. DATA TRANSFORMATION BASED ON THE INDEXED STRUCTURE FOR HIGH-SPEED SEARCH
At the stage of the preliminary data analysis, let's create a specialized structure on the etalon set $E$ of images for high-speed efficient classification.
Let's apply a partition $T$ to the set $E$ of image database descriptors. We obtain the set of $M$ disjoint groups $T_k(E)$:

$$T: E \to \{T_k(E)\}_{k=1}^{M}, \quad \bigcup_{k=1}^{M} T_k(E) = E, \quad T_k(E) \cap T_j(E) = \varnothing \ (k \ne j). \tag{6}$$

Today, the two most common methods for partitioning multidimensional data are hashing and clustering [3], [5], [9], [11], [26].
Hashing is associated with precise processing tools; clustering is related to approximate methods of self-learning, as its result is not strictly formalized. The transformation (6) practically performs a preliminary classification and preserves the entire set of analyzed data, which is now distributed among the groups $T_k(E)$. As the result of transformation (6), each descriptor $e_v \in E$ of the etalon database gets the parameter $k$ of the group number (basket or cluster).

Given the existing initial partitioning

$$E = \bigcup_{i=1}^{N} E_i$$

into the etalon images, let's define the parameter

$$t_{i,k} = \operatorname{card}\{e_v \in E : e_v \in E_i,\ e_v \in T_k\} \tag{7}$$

as the number of descriptors of the $i$-th class in the basket $T_k$. Based on the parameter $t_{i,k}$, we have the quantitative weight vector $b = (b_1, \ldots, b_N)$ of the basket:

$$b_i = \frac{t_{i,k}}{\sum_{j=1}^{N} t_{j,k}}. \tag{8}$$

The equality (8) defines the statistical distribution of the elements of each data segment by the etalon classes in the form of the weight coefficients of the classes, given $\sum_{i=1}^{N} b_i = 1$. The distribution $b$ is the common characteristic for the set of elements of the data segment obtained as a result of the analysis (training) for the database $E$. At the training stage, to increase the effectiveness of the classification, logical processing of the vector $b$ may be effective, e.g., to take into account the influence of the coefficients $b_i$ of the most important classes in the basket [2].
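As an illustration of the per-basket class statistics, the sketch below counts the descriptors of each class in each basket and turns the counts into weights. It assumes that the weight coefficients are the counts normalized to sum to 1 within a basket; the function name and the input format (pairs of basket number and class number) are hypothetical.

```python
from collections import Counter, defaultdict

def basket_class_weights(labelled):
    """labelled: iterable of (basket k, class i) pairs, one per etalon descriptor.

    Returns (counts, weights), where counts[k][i] is the number of class-i
    descriptors in basket k and weights[k][i] is the normalized share
    (assumed form of the weight coefficients).
    """
    counts = defaultdict(Counter)
    for k, i in labelled:
        counts[k][i] += 1
    weights = {}
    for k, per_class in counts.items():
        total = sum(per_class.values())
        weights[k] = {i: t / total for i, t in per_class.items()}
    return counts, weights

# Toy example: basket 0 holds two descriptors of class 1 and one of class 2,
# basket 1 holds a single descriptor of class 2.
counts, weights = basket_class_weights([(0, 1), (0, 1), (0, 2), (1, 2)])
```

A basket whose weight vector is concentrated on one class could, under this assumption, be resolved without any metric computation.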
For hashing, the attribute of a separate data basket is the value of the hash function. An important parameter that affects the speed of the search is the number of baskets $M$. The smaller the number of baskets, the higher the speed of the transition to the basket. The larger the number of baskets, the smaller the amount of data to analyze inside the baskets. The boundary situations are:
1) A single basket ($M = 1$).
2) The complete absence of hashing ($M = N \cdot s$), which corresponds to the linear search.
According to the researchers, it is possible to formulate the task of optimizing the number M of baskets, where the criterion is the number Q of metric calculations for the description components [9], [13], [14], [19].
If the average number of elements in one basket is $\frac{N \cdot s}{M}$, then the number of calculations according to (2) within the indexed structure using the hash-code values is proportional to $Q_1 = s \cdot M + \frac{N \cdot s}{M}$, which is much less than $Q_2 = N \cdot s^2$ for the traditional linear search. For the specific values $s = 500$, $N = 10$, $M = 10$, we get $Q_1 = 5500$ and $Q_2 = 2\,500\,000$, so the gain $\beta = Q_2 / Q_1$ is about 450 times, and it increases with increasing $N$ and $s$.
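The estimate can be checked numerically. The short sketch below evaluates the two expressions exactly as written in the text; the function name is an assumption, and the gain is computed as the ratio of the linear-search cost to the indexed cost.

```python
def metric_counts(N: int, s: int, M: int):
    """Return (Q1, Q2, gain) for N etalons, s descriptors each, and M baskets."""
    q1 = s * M + N * s / M   # indexed structure, as estimated in the text
    q2 = N * s ** 2          # traditional linear search
    return q1, q2, q2 / q1

q1, q2, gain = metric_counts(N=10, s=500, M=10)
print(q1, q2, round(gain, 1))
```

For these values the ratio is roughly 454.5, which matches the figure of about 450 times quoted above.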
The data processing method using hashing [5], [9] performs the preliminary classification into separate baskets and has its own features. The number of baskets for this method is, as a rule, specified in advance and is related to the range of hash-key values $h(e_v)$ chosen by the user. As a rule, the key is an integer that takes values from the finite set of hash function values [5], [10].
For the vectors from $B^n$, such a key can be, e.g., the number of unit bits in the descriptor, the number of selected bit combinations of arbitrary length, and others. In general, the feature $h(e_v)$ for hashing can be adapted to the values of the descriptions of the etalon database.
When implementing the hashing, let's divide the dataset $E$ into $L$ disjoint groups $H_k(E)$ (baskets), each containing the descriptors with the same value of the hash key:

$$H_k(E) = \{e_v \in E : h(e_v) = h_k\}, \quad k = 1, \ldots, L,$$

where $h_k$ denotes the $k$-th value of the hash key. Let's now implement the specific chain of data pre-processing tools to provide the sped-up search (Figure 1) [11].
Let's apply the block representation of the data of the descriptor set, calculating two values of the hash function h(1), h(2) for each of the descriptors of the etalon base. For the descriptor Binary Robust Invariant Scalable Keypoints (BRISK) [27] of the size of 512 bits, let's calculate the value h(1), h(2) for the first and second half according to the scheme ''256 + 256''.
Next, let's implement the LSH-processing [9]: we sort the etalon set $E = \{e_v\}_{v=1}^{s \cdot N}$ of the descriptors of the entire database by the value of the hash function $h(1)$ for the 1st block.
Using the logical analysis, let's process the ranges of the values h(1), excluding the ranges insignificant for the existing database of etalon data from the search.
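The pre-processing chain (block hashing, sorting by $h(1)$, and inspection of the occupied ranges) can be sketched as follows. This is an illustrative sketch under stated assumptions: descriptors are stored as Python integers, the hash function is taken to be the number of unit bits in each half of the descriptor, and all function names are hypothetical.

```python
from collections import Counter

def popcount(x: int) -> int:
    """Number of unit bits in a non-negative integer."""
    return bin(x).count("1")

def block_hashes(e: int, n: int = 512):
    """Hash keys (h(1), h(2)) for the first and second half of an n-bit descriptor."""
    half = n // 2
    return popcount(e >> half), popcount(e & ((1 << half) - 1))

def build_index(etalons, n: int = 512):
    """Records (h1, h2, class_id, descriptor) for the whole base, sorted by h1."""
    records = []
    for class_id, descriptors in enumerate(etalons, start=1):
        for e in descriptors:
            h1, h2 = block_hashes(e, n)
            records.append((h1, h2, class_id, e))
    records.sort(key=lambda rec: rec[0])
    return records

def range_occupancy(index, width: int):
    """Count records per (start of h1 range, class) to spot empty or one-class ranges."""
    return Counter(((h1 // width) * width, class_id) for h1, _, class_id, _ in index)

# Toy base with 8-bit descriptors (the scheme in the text is 256 + 256 bits).
index = build_index([[0b11110000, 0b00000001], [0b11111111]], n=8)
occupancy = range_occupancy(index, width=2)
```

Ranges of $h(1)$ that never appear in `occupancy` contain no etalon data and are candidates for exclusion from the search.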

IV. THE CLASSIFIER BASED ON THE SYNTHESIZED STRUCTURE
Let's apply the KP detector to the image of the recognized object. Let's form the description $Z = \{z_w\}_{w=1}^{s}$ of the object as the set of descriptors. Let's construct the classifier using high-speed search in the base $E$ based on the created structure.
The proposed method, when implementing the chain of tools, includes the following actions (Figure 2):
1) For any descriptor $z^* \in Z$ of the input image, let's calculate the hash keys $h^*(1)$, $h^*(2)$ as the values of the hash function for the argument $z^*$.
2) When searching the database for the descriptor identical to the analyzed $z^* \in Z$ with values $h^*(1)$, $h^*(2)$, let's apply the linear search in the ''data band'' by the value of the hash function $h(1)$ for the 1st block, limited by the specified limit.
The proposed high-performance classification method based on hashing can realize both accurate and approximate types of search. Its parameters are the type of hash function, the accuracy (the tolerance for the hash-function value), the number of blocks, and the number of hash baskets constructed for the data components in the indexed structure. We can control these parameters based on the fullness of the baskets for the specific data.
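A possible realization of the banded search and the subsequent voting is sketched below. Several details are assumptions made for the illustration and are not taken from the text: the tolerance is called `eps`, the second hash key is used as an additional filter with the same tolerance, the hash function is the unit-bit count of each half, and all function names are hypothetical.

```python
from collections import Counter

def popcount(x: int) -> int:
    """Number of unit bits in a non-negative integer."""
    return bin(x).count("1")

def block_hashes(d: int, n: int):
    """Hash keys (h(1), h(2)): unit-bit counts of the two halves of an n-bit descriptor."""
    half = n // 2
    return popcount(d >> half), popcount(d & ((1 << half) - 1))

def build_index(etalons, n: int):
    """Records (h1, h2, class_id, descriptor), sorted by h1; classes numbered from 1."""
    records = [(*block_hashes(e, n), class_id, e)
               for class_id, descriptors in enumerate(etalons, start=1)
               for e in descriptors]
    return sorted(records, key=lambda rec: rec[0])

def classify_descriptor(z, index, n, eps, delta_rho):
    """Nearest etalon descriptor inside the band |h(1) - h*(1)| <= eps, or None."""
    h1, h2 = block_hashes(z, n)
    best_class, best_dist = None, None
    for e_h1, e_h2, class_id, e in index:
        if e_h1 < h1 - eps:
            continue                # below the data band
        if e_h1 > h1 + eps:
            break                   # index is sorted by h1, nothing further matches
        if abs(e_h2 - h2) > eps:
            continue                # assumed filter on the second block
        dist = popcount(z ^ e)      # Hamming distance
        if best_dist is None or dist < best_dist:
            best_class, best_dist = class_id, dist
    if best_dist is None or best_dist > delta_rho:
        return None
    return best_class

def classify_object(Z, index, n, eps, delta_rho, delta_r):
    """Vote over all object descriptors and apply the vote threshold."""
    votes = Counter()
    for z in Z:
        d_w = classify_descriptor(z, index, n, eps, delta_rho)
        if d_w is not None:
            votes[d_w] += 1
    if not votes:
        return None, votes
    i0, r_max = votes.most_common(1)[0]
    return (i0 if r_max >= delta_r else None), votes

# Toy usage with 8-bit descriptors instead of 512-bit ones.
index = build_index([[0b11110000, 0b11100000], [0b00001111, 0b00000111]], n=8)
label, votes = classify_object([0b11110001, 0b11000000, 0b00001110], index,
                               n=8, eps=1, delta_rho=2, delta_r=2)
```

With the unit-bit count as the hash function, two descriptors at Hamming distance at most `eps` differ by at most `eps` in each block hash, so in this sketch the band never discards a neighbor that close; neighbors farther away may be missed, which corresponds to the approximate type of search.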

V. RESULTS OF COMPUTER SIMULATION
The software modeling has been performed in PyCharm Community Edition 2020.2.3 using the OpenCV library and the Python programming language. The hardware is a laptop with an Intel Pentium CPU N3540 2.16 GHz processor and 4 GB of Random Access Memory (RAM). The Binary Robust Invariant Scalable Keypoints (BRISK) keypoint detector [27], [28], [29] with a descriptor of dimension $n = 512$ has been used to determine the KP descriptors. The developed models of the classifier have been applied to images of famous people (a politician, an artist, and a scientist); the images have been scaled to 500 × 500 pixels. The etalon image classes and the generated KP coordinates are illustrated in Figure 3. The number of calculated descriptors in the description of each of the etalons is $s = 500$.
The number of units in the binary code of the vector has been taken as the hash function $h(\ldots)$, and the accuracy parameter, equal to 13 for both blocks, has been chosen as ±5% of the block size of 256 bits ($0.05 \cdot 256 \approx 13$); it reflects the tolerance for the value of the hash function.
Given the binary type of the analyzed data, the Hamming metric (3) has been used to compare the vectors. Inside the band of the fixed hash values in the baskets, we have used the traditional linear search in the experiment.
To compare the performance characteristics, we have simulated the software method of linear search on the whole set of the etalon data (1500 descriptors) without using the indexed structures.
The most accurate result has been obtained for the set of the etalon images both with the indexed hash structures for 256 baskets and with the linear search: all 500 descriptors of each of the etalons have been correctly assigned to their class. In this case, the processing time without indexed hashing is 500 seconds, and with the combined hashing it is only 49 seconds.
To locate the data band in the set sorted by the value $h(1)$, instead of the linear search one can use, e.g., the dichotomy (binary search) method [10], [18], [30], which will make the gain even more significant.
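Locating the band by dichotomy can be illustrated with Python's standard `bisect` module. The sketch assumes that the $h(1)$ values of the sorted index are available as a separate sorted list; the function name and the sample values are hypothetical.

```python
from bisect import bisect_left, bisect_right

def band_bounds(sorted_h1, h_star: int, eps: int):
    """Return (lo, hi) such that sorted_h1[lo:hi] lies within [h_star - eps, h_star + eps]."""
    lo = bisect_left(sorted_h1, h_star - eps)
    hi = bisect_right(sorted_h1, h_star + eps)
    return lo, hi

# Hypothetical sorted h(1) values of etalon descriptors.
sorted_h1 = [100, 110, 120, 125, 130, 140, 150]
lo, hi = band_bounds(sorted_h1, h_star=125, eps=13)
band = sorted_h1[lo:hi]
```

Both boundaries are found in a logarithmic number of comparisons, after which only the records in `sorted_h1[lo:hi]` need a metric computation.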
As we can see, the gain in computational time for the proposed approach (with 3 etalons and $s = 500$) is over 10 times compared to the traditional one. It is clear that the gain in computation time increases with the number of etalons and of descriptors in the descriptions. In addition, the computation time significantly depends on the created software model, the computer type, and the method of access to the applied software, so only relative indicators are objective.
Hashing is one of the precise methods of data transformation. Nevertheless, a classification error may occur when hashing is implemented, since a basket of grouped data can contain descriptors of different classes with equal or close values of the hash function [31], [32].
The effect of the threshold parameter $\delta_\rho$ for the value of the minimum distance, when deciding on the class of the object descriptor according to rule (2), is important for providing an effective classification.
The experiments with the rotation transformation (the largest distortion of the image, Figure 4) applied to the input etalon images have shown that the proposed method successfully classifies all etalon images. The histogram of the votes for the first image in Figure 4 contains the values (415, 52, 33); for the other two transformed images the histograms are (104, 330, 66) and (60, 59, 381), which confirms the reliable classification for each of the classes.
The maximum number of votes for the ''correct'' etalon is over three times higher than the next largest value.
For a more detailed analysis of the quantitative composition of the baskets by the indicators (7), (8), these values have been calculated by dividing the sorted data set into 8 equal blocks by the value of the hash function $h(1)$ in the range 0 to 255. Table 1 contains the values of the indicator (7) for these 8 ranges, i.e., the number of hash-function values $h(1)$ that fall into the uniform data ranges (32 values each) for the different etalons. Table 1 shows that, to further reduce the calculations for classification for this set of etalons, we can perform the following actions:
1) Exclude some ranges (0 to 63, 192 to 255) from the analysis.
2) Establish the significant advantage of one class for the ranges 64 to 95 and 160 to 191.
3) Apply the developed method to the remaining ranges.
Table 1 thus reveals further possibilities for improving the proposed classification method by considering the specific values of the selected hash function for a fixed database of etalon images: individual ranges of the hash function (e.g., 0 to 63) need not be analyzed at all during the classification, or the class can be determined from a significant advantage of one of the etalons. This preliminary analysis of the etalon data will enable us to further reduce the classification time.
In all conducted experiments, the classification has been performed correctly, i.e., all transformed input images have received their correct class number. The obtained results confirm the effectiveness of the developed method.
The developed method is based on the model of metric comparison with the etalon and on the calculation of descriptor votes for individual classes, which in the statistical aspect corresponds to the maximum likelihood principle.
Thus, by implementing the index structure and hashing in the experiment, a gain of over 10 times in classification time has been obtained. The main limitation of hashing is considered to be some increase in the required amount of computer memory [5], [9], [10], [12], [13], [14].
In our article [33] we have described the results of the experiments on the use of the hashing apparatus for the classification of images of the different breeds of dogs using an Oriented FAST and Rotated BRIEF (ORB) detector.
The research [33] has shown that the classification time decreases with the increase of the number of the baskets because the number of descriptors inside the baskets decreases, and after determining the basket, the Hamming distance is already calculated for the smaller number of descriptors. The search for the basket by the value of the hash function is much faster than calculating the metric for the set of descriptors inside the baskets.
In that work, hashing with 256 baskets was used in the traditional sense, without additional data processing, which reduces the computational costs compared to the method proposed here. The research for three etalons with 500 descriptors in their descriptions has shown a gain in classification speed of 65 times for the hashing method compared to the classical linear search [33]. Precise classification results for the etalons have been obtained for a wide range of values of the number of baskets. These results have confirmed the feasibility and effectiveness of hashing for the procedures of structural image recognition.
In the proposed research, to reduce the volume of metric search in image classification by structural methods, a combination of tools has been used: hashing, partitioning (design), LSH-processing as sorting by the hash function, and logical analysis (coordinated filtering) of the hash-function values for the blocks.
The advantages of the proposed method are the linear search only in a limited and parametrically controlled band, reduction of the data dimension, adaptation to the peculiarities of the applied classification task, and allowance for the deviation of similarity values within certain limits. Introducing the additional logical analysis of the etalon data at the preliminary stage of processing will further reduce the processing time during classification.
The peculiarity of the classifier is that at each step the search for the relevant descriptor of the description is not exact but allows for a tolerance. This improves noise immunity.

VI. CONCLUSION
The considered methods of classification are based on the principle of ''comparison with the etalon'' and can be applied to the arbitrary data vectors of bit type.
When applying the hash structures, the key point is to choose an effective and data-adapted hash function that performs partial classification without changing the data. A deeper analysis of the values of the hash function for the specific etalons allows further reduction of the necessary computational costs.
The scientific novelty of the research is the development of the productive method of image classification based on the implementation of high-speed search using the chain method of processing indexed hash data structures that reduces the required amount of computation by tens of times.
The practical significance of the work is to build the classification models in the transformed data space, confirm the efficiency of the proposed modifications on the examples of images, and create software applications for the implementation of the developed classification methods for the computer vision systems.
Prospects for research can be related to the introduction of the logical processing of data distributions within the hash structure, the study of noise immunity of the developed methods, and the evaluation of their applied performance in the multidimensional image sets.
YOUSEF IBRAHIM DARADKEH received the Doctor of Engineering Sciences (Ph.D. and P.Eng.) degrees in computer engineering and information technology (computer systems engineering and computer software engineering).
He was a dynamic academician having more than 15 years of experience in teaching and scientific research development and administration experience. He was a Postdoctoral Research Fellow with the Department of Electrical and Computer Engineering, University of Calgary, Canada. He has taught wide spectrum of computer science, computer engineering and networks, and computer software engineering courses (undergraduate and graduate degrees). He has an excellent experience in designing courses that bridge the gap between academia and industry and follow the accreditation requirements. He is currently an Associate Professor with the Department of Computer Engineering and Networks, College of Engineering-Wadi Addawasir, Prince Sattam bin Abdulaziz University, Saudi Arabia. He is a Senior Scientific Researcher and an Assistant Dean for the Administrative Affairs. He is also a well-known and respected scientist internationally. He has published over 90 high-quality refereed research articles in the international journals and conferences. He has also published two books, one chapter, and edited book in the most prestigious publications. His international recognition of scientific achievements is demonstrated by numerous invitations to participate with the program committees of the international conferences and foreign journals and lecturing with renowned scientific centers around the world. He has a membership of the International Academy of Science and Engineering for Development (IASED).
VOLODYMYR GOROKHOVATSKYI received the M.Sc. degree in applied mathematics and engineering from the Kharkiv National University of Radio Electronics (KNURE), the Ph.D. degree in management (technical systems), in 1984, and the Dr.Sc. degree in systems and tool of artificial intelligence, in 2010.
He received an Internship from Dresden Technical University. He is currently working as a Professor with the Department of Informatics, KNURE. He has more than 270 scientific articles and seven monographs. His research interests include the image and pattern recognition in computer vision systems, structural methods of image classification and recognition, multidimensional data analysis, and artificial intelligence.
IRYNA TVOROSHENKO received the Ph.D. degree in artificial intelligence systems and means, in 2010.
She is currently an Associate Professor with the Department of Informatics, Kharkiv National University of Radio Electronics. She has published 161 scientific articles and educational and methodical articles, including four study guides, seven monographs, 38 articles, 85 abstracts of reports, 14 lecture notes, and 13 methodological instructive regulations. She is fluent in modern programming languages and technologies, computer-aided mathematical modeling, and constantly expanding her range of scientific interests. Her research interests include image and pattern recognition in computer vision systems, structural methods of image classification and recognition, and fuzzy methods in artificial intelligence appliance.
MEDIEN ZEGHID received the Ph.D. degree in information and communication, sciences and technologies from the University of South Brittany, Lorient, France, in 2011. He is currently an Assistant Professor with the Department of Computer Engineering and Networks, Prince Sattam bin Abdulaziz University. His research interests include information security, architectural synthesis for the crypto-systems, image and video coding, and optical communication.