A Novel Ensemble Beta-Scale Invariant Map Algorithm

This research presents a novel topology preserving map (TPM) called Weighted Voting Supervision-Beta-Scale Invariant Map (WeVoS-Beta-SIM), based on the application of the Weighted Voting Supervision (WeVoS) meta-algorithm to a novel family of learning rules called Beta-Scale Invariant Map (Beta-SIM). The aim of the novel TPM is to improve on the original models (SIM and Beta-SIM) in terms of stability and topology preservation while preserving their original features, especially in the case of radial datasets, where they are all designed to perform best. These scale invariant TPMs have produced very satisfactory results in previous research, generating accurate topology maps effectively and efficiently. The WeVoS meta-algorithm is based on training an ensemble of networks and combining them to obtain a single network that includes the best features of each network in the ensemble. WeVoS-Beta-SIM is thoroughly analyzed and demonstrated in this study over 14 diverse real benchmark datasets with differing numbers of samples and features, using three well-known quality measures. In order to present a complete study of its capabilities, results are compared with other topology preserving models such as Self Organizing Maps, Scale Invariant Map, Maximum Likelihood Hebbian Learning-SIM, Visualization Induced SOM, Growing Neural Gas and Beta-Scale Invariant Map. The results obtained confirm that the novel algorithm improves the quality of the single Beta-SIM algorithm in terms of topology preservation and stability without losing performance (where Beta-SIM has been shown to outperform other well-known algorithms). This improvement is more remarkable as the complexity of the datasets increases, in terms of number of features and samples, and especially in the case of radial datasets, where the Topographic Error improves.


I. INTRODUCTION
The extraction of information from enormous datasets that are generated by modern experimental and observational methods is increasingly necessary in almost all industrial and scientific fields and business operations nowadays. This ''information extraction'' [1] is defined as the nontrivial data mining [2], [3] of implicit, previously unknown, and potentially useful information. Among several fields where ''information extraction'' is not an easy task, Big data [4]-[8] is one of the most recent and important topics where the use of intelligent techniques becomes crucial to be able to extract knowledge from the enormous amounts of information. One of the many techniques used to extract relevant information is data visualization [9]-[13].
The associate editor coordinating the review of this manuscript and approving it for publication was Aysegul Ucar.
A recent advance in this field is the Beta-Scale Invariant Map (Beta-SIM) [14], which is based on a modification of a topology-preserving map that can be used for scale invariant classification [11], [14], [15, p. 2], [16], [17], obtained by deriving new learning rules from the Beta distribution and applying them to the SIM [11], [17].
Another widely used clustering and classification algorithm is the Growing Neural Gas (GNG) algorithm, proposed by Fritzke [18], [19]. It is based on the Neural Gas (NG) algorithm previously proposed by Martinetz et al. [20] for finding optimal data representations based on feature vectors, which is in turn a modification of the widely known SOM. The main characteristic of the NG algorithm is that instead of expanding through the data input space as a fixed grid of units (as done by the SOM algorithm), the NG algorithm allows the neighbouring relationships of its units to change, expanding more like a gas over the data space.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
On the other hand, one of the main problems of Artificial Neural Networks (ANN) is that they can be rather unstable, obtaining different results despite being trained with the same dataset and similar parameters [21]. In this kind of algorithm, the samples of a dataset are presented in random order and the algorithms are also randomly initialized, so that the same samples are not favoured in different runs. Therefore, the final results after several runs of the training process can be rather different, even though a statistical analysis of the results could show that the algorithms are stable [11]. Ensembles are widely used with ANNs and have proved useful for increasing the stability of Topology Preserving Maps [22]. Ensembles [16], [23], [24] are based on the concept that a group of experts will obtain a better result for a problem than a single expert. In this study, the WeVoS meta-algorithm [25] is combined with another recent topology preserving map (Beta-SIM) [14] to obtain the novel WeVoS-Beta-SIM. The aim is to improve the stability of the original algorithm (Beta-SIM) and obtain a better topology preserving representation without losing the performance of Beta-SIM, which provides a better visualization of the internal structure of high dimensional datasets.
Weighted Voting Supervision (WeVoS) [25] combines the final network maps of an ensemble of Topology Preserving Maps in a single one that includes the best features of each network in the ensemble, trying to solve the problem previously described of instability of neural networks.
Therefore, this research presents and thoroughly analyses the application of the WeVoS meta-algorithm to a new family of learning rules called Beta-SIM, giving a novel algorithm called WeVoS-Beta-SIM. As a result, the new algorithm is able to match the performance of the Beta-SIM algorithm (in terms of Classification Error and Mean Quantization Error) and at the same time improve the topology and stability of the generated grids (in terms of Topographic Error). It is also compared with other well-known Topology Preserving Maps and WeVoS versions: the Self Organizing Map (SOM), WeVoS-SOM, Scale Invariant Map (SIM), Maximum Likelihood Hebbian Learning-SIM (MLHL-SIM), Visualization Induced SOM (ViSOM), Growing Neural Gas (GNG) and Beta-SIM. The present study reports the application of these algorithms to 14 diverse real benchmark datasets from the UCI web repository [26]. This group of 14 datasets is related to industrial, scientific and economic case studies, and contains different combinations of number of features, number of samples and number of classes, in order to test the novel WeVoS-Beta-SIM algorithm with very diverse datasets and validate its behavior.
This study is organized as follows: Section II presents the main Topology Preserving Maps and details the well-known SIM algorithm; Section III introduces ensembles; and Section IV describes the novel ensemble WeVoS-Beta-SIM in detail. Section V outlines the quality measures, previously proposed in the literature, used to evaluate different properties of topology preserving mapping algorithms. Section VI analyzes the capabilities of the WeVoS-Beta-SIM algorithm through a detailed study over 14 real benchmark datasets with diverse characteristics. Finally, the last section contains the conclusions and outlines future lines of research.

II. TOPOLOGY PRESERVING ALGORITHMS
Among the great variety of tools for multidimensional data visualization, several of the most widely used are those belonging to the family of the Topology Preserving Maps [27]-[29]. Probably the best known among these algorithms is the Self-Organizing Map (SOM) [27], [30]-[32], based on a type of unsupervised learning called competitive learning. Other types are related to the SIM algorithm [11], [33], [34], which differs from the SOM in that it is designed to perform best with radial datasets, because SIM-based maps create a mapping where each neuron captures a ''pie slice'' of the data according to the angular distribution of the input data.

A. DIFFERENT TYPES OF TP ALGORITHM
Several extensions of SOM can be found in the literature such as the Generative Topographic Mapping (GTM) [35]- [37] which was developed as a probabilistic version of the SOM, in order to overcome some of its limitations, particularly the lack of an objective function. An important potential application of the GTM is allowing a simpler visualization of high-dimensional data.
Other extensions of SOM are the Topographic Product of Experts (ToPoE) [38], and the Harmonic Topographic Map (HaToM) [38], [39], where the topology preserving map is created from a product of experts.
The Visualization Induced SOM (ViSOM) [16], [40], [41] is a SOM extension proposed for the direct preservation of the local distance information on the map, along with the topology. The ViSOM constrains the lateral contraction forces between units and hence regularizes the inter-unit distances, so that distances between units on the map are in proportion to those in the input space. The ViSOM takes into account not only the distance between a unit's weights from one iteration to the next, but also the distance between that unit and the Best Matching Unit (BMU) within the whole map. This allows the ViSOM to preserve topology by maintaining the distance between neighbours of the winner unit.
Two other interesting topology preserving models are the Scale Invariant Map (SIM) [9], [11], [17] and the Maximum Likelihood Hebbian Learning Scale Invariant Map (MLHL-SIM) [17], [42]. Both are designed to perform best with radial datasets, as each neuron of the map captures only a radial portion of the data based on its angular distribution. In contrast, when a SOM is trained, it approximates a Voronoi tessellation of the input space [27]. The Scale Invariant Map is an implementation of the negative feedback network [43] to form a topology preserving mapping. The main difference between this mapping and the SOM [27], [31] is that this mapping is scale invariant.

B. SCALE INVARIANT MAP
The main target of the family of Topology Preserving Maps [27] is to produce low dimensional representations of high dimensional datasets maintaining the topological features of the input space.
The SIM [9], [11], [34] is an algorithm similar to the SOM [27], but its training uses a method based on the negative feedback network [43], [44]. SIM uses a neighbourhood function and competitive learning in the same way as a SOM. The SIM model is defined by (1), (2) and (3).
Feedforward: $y_i = \sum_{j=1}^{N} W_{ij} x_j, \ \forall i$ (1)
Feedback: $e = x - W_c \, y_c, \ (y_c = 1)$ (2)
Weights update: $\Delta W_i = h_{ci} \, \eta \, e$ (3)
where x is an N-dimensional input vector and y an M-dimensional output vector, with W_ij being the weight linking input j to output i; e is the residual or error, η the learning rate, W_c the weight vector connected to the output winner and h_ci the neighborhood function, which is a Gaussian function in this case. The input data x_j is fed forward through the weights W_ij to the output neurons y_i, where a linear summation is performed to obtain the activation of the output neurons (1).
Based on the previously obtained neuron activations, a winner neuron is selected using the minimum Euclidean distance (the neuron whose weight vector is closest to the input vector wins) or using the maximum activation (the neuron with the highest activation wins).
After selection of an output winner, the winner, c, is deemed to be firing (y_c = 1) and all other outputs are suppressed (y_i = 0, ∀i ≠ c).
The winner's activation is then fed back through its weights and subtracted from the inputs, and simple Hebbian learning is used to update the weights of all nodes in the neighborhood of the winner.
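The training step described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function name `sim_train_step`, the rectangular lattice coordinates in `grid`, the Gaussian width `sigma` and the default values are assumptions, and the winner is chosen here by maximum activation (one of the two options mentioned in the text).

```python
import math


def sim_train_step(W, x, grid, eta=0.1, sigma=1.0):
    """One SIM update for input x (hypothetical helper, for illustration).

    W:    list of M weight vectors, each of length N (updated in place).
    grid: list of M (row, col) lattice coordinates, one per output neuron.
    Returns the index c of the winning neuron.
    """
    # Feedforward: linear activation y_i = sum_j W_ij * x_j
    y = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in W]
    # Winner selection by maximum activation
    c = max(range(len(W)), key=lambda i: y[i])
    # Feedback with y_c = 1: residual e = x - W_c
    e = [x_j - w_j for x_j, w_j in zip(x, W[c])]
    # Hebbian update of every neuron, scaled by a Gaussian neighbourhood h_ci
    for i, w in enumerate(W):
        d2 = (grid[i][0] - grid[c][0]) ** 2 + (grid[i][1] - grid[c][1]) ** 2
        h = math.exp(-d2 / (2.0 * sigma ** 2))
        for j in range(len(w)):
            w[j] += eta * h * e[j]
    return c
```

Because the activation is linear in x, multiplying an input by any positive constant does not change which neuron has the maximum activation; under this selection rule each neuron therefore responds to a direction (a ''pie slice'') rather than to a region at a given distance from the origin.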

III. ENSEMBLES
Ensembles are meta-algorithms used to improve the performance of algorithms, mainly in supervised learning [45]-[47]. An ensemble can be viewed as a group of experts working together to solve a problem. To that end, a dataset is divided into several parts and one model is generated from each of these parts. Finally, the ensemble meta-algorithm combines the resulting models to predict a result.
The main strength of ensembles is that they are able to strike a good balance between small variance and small bias. The main reason is that different classifier designs potentially offer complementary information on the patterns to be classified, which can be harnessed to improve the performance of the selected classifier.
There are two main approaches [11], [16], [45] to training the classifiers or components of the ensemble:
• Independent training: Each model is trained without knowledge of the other models, Bagging being the most widely used technique [47], [55].
• Coordinated training: The training of one model takes into account how the other models were trained. This concept is applied by Boosting techniques [47], [55].
Currently, the use of ensembles for unsupervised learning is not strongly developed, the main reference being the ensembles used in topology-preserving maps. In all cases, the generated maps are fused based on some metric, such as Fusion by Euclidean Distance or Similarity of Voronoi Polygons; however, these do not take the topological neighborhood into account. To avoid this problem, several fusion methods have been developed in recent years for topology-preserving maps [11], [48]-[51].
One of the key points is how the weights of each network in the ensemble are initialized, so that neurons in the same position of two networks are comparable. Maps should therefore be as similar as possible. Thus, all maps are trained using the same parameters, and each ensemble model is initialized with the values with which the previous model finished its training [11], [16].

IV. WeVoS-Beta-SIM
This section presents a novel ensemble algorithm based on the Beta-SIM TPM family, intended to improve the stability of the members of this family by means of a weighted voting system that fuses the neurons of the different Beta-SIM maps.
Beta-SIM [14] is a novel version of SIM [17], based on the application to SIM of a family of learning rules called Beta Hebbian Learning (BHL) [52].
The main difference between Beta-SIM and SIM is that in Beta-SIM the BHL rule is used to update the weights of all nodes in the neighbourhood of the winner. Beta-SIM can therefore be defined by means of (4), (5) and (6), where (6) is obtained by applying the BHL method to the SIM weights update.
In these expressions, x is an N-dimensional input vector and y an M-dimensional output vector, with W_ij being the weight linking input j to output i; e is the residual or error, η the learning rate, W_c refers to the weights of the winning neuron, h_ci represents the neighborhood function, which is a Gaussian function in this case, and α and β are the parameters that determine the shape of the PDF curve. Therefore, by maximizing the likelihood of the residual with respect to the actual distribution, the learning rule is matched to the PDF of the residual (e).
The Beta-SIM algorithm is only stable when the absolute value of the residuals (|e|) is lower than 1 [14], as when values of the residuals are beyond this limit, the value of the weights update tends towards infinity. To avoid the possibility that the residuals have values higher than 1, the datasets should be normalized in order to satisfy this limitation and preserve the internal topology between dataset dimensions.
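To illustrate why the residual range matters, the following short derivation differentiates an unnormalised Beta kernel with respect to the residual. It is a sketch under the assumption that the learning rule is driven by the derivative of a Beta-shaped PDF of e; it is not claimed to be the exact form of (6).

```latex
% Assumed (unnormalised) Beta kernel on the residual e, with 0 < e < 1
p(e) \propto e^{\alpha-1}\,(1-e)^{\beta-1}

% Derivative with respect to e (product rule, then factorisation)
\frac{\partial p}{\partial e}
  \propto (\alpha-1)\,e^{\alpha-2}(1-e)^{\beta-1}
        - (\beta-1)\,e^{\alpha-1}(1-e)^{\beta-2}
  = e^{\alpha-2}\,(1-e)^{\beta-2}\,
    \bigl[(\alpha-1) - e\,(\alpha+\beta-2)\bigr]
```

Under this assumption, the factor (1-e)^{β-2} grows without bound as e approaches 1 whenever β < 2, and the kernel is not defined over the reals for e > 1 with non-integer exponents, which is consistent with the requirement that the data be normalized so that |e| stays below 1.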
The capability of the Beta-SIM algorithm to adapt to sparse clusters or to neglect them, based on combinations of the parameters α and β [14], gives the units of a Beta-SIM network more freedom than other Topology Preserving Maps to adapt to datasets; however, it also potentially adds instability to the training. The use of ensembles, and specifically of the WeVoS fusion algorithm [15], therefore seems to be the most appropriate method to correct this effect.
WeVoS [25] is a weighted voting ensemble system that fuses single TPMs. The fusion of such TPMs generates a final TPM that reduces the complexity and increases the accuracy with respect to single maps trained without ensemble techniques. It has previously been applied successfully to other well-known Topology Preserving Maps [11], [16], [25]. Beta-SIM has been shown to outperform such topology preserving algorithms in the main aspects targeted by Topology Preserving Maps, such as Mean Quantization Error (MQE), Classification Error (CE) and Topographic Error (TE) [14].
This research therefore presents a novel algorithm based on the application of the WeVoS ensemble to Beta-SIM, together with a detailed study and comparison with other well-known Topology Preserving Maps and their ensemble versions, such as WeVoS-SOM, to analyse its impact on aspects such as stability and topology preservation.
WeVoS-Beta-SIM obtains a final map as a combination of different Beta-SIM maps by fusing the neurons in the same position through weighted voting. The weight of each vote is given by (7):
$V_{p,m} = \frac{\sum b_{p,m}}{\sum_{i=1}^{M} \sum b_{p,i}} \cdot \frac{q_{p,m}}{\sum_{i=1}^{M} q_{p,i}}$ (7)
where V_{p,m} is the weight of the vote for the unit in position p of map m of the ensemble, M is the total number of Beta-SIM maps, b_{p,m} is the binary vector used for marking the dataset entries recognized by the unit in position p of map m, and q_{p,m} is the value of the desired quality measure for the unit in position p of map m. b is a binary vector with as many elements as there are samples in the dataset; it stores the samples recognized by a single unit.
The fusion of the neurons of the different Beta-SIM maps in WeVoS-Beta-SIM, during the training process, is based on a quality measure [11] calculated for each Beta-SIM map.
This quality measure is used during the fusion process, where the weight given to each neuron is proportional to the value of the measure: to modify the position of a neuron in the fused map, the weights of each of the neurons in that position are fed to the final map (nodes N1, N2, N3 in Fig. 1). In the ensembles, neurons in the same position of different networks are fused, so their weights should be similar in order to be comparable. Maps should therefore be as similar as possible. Thus, all maps are trained using the same parameters, and each ensemble model is initialized with the values with which the previous model finished its training [11], [16].
Briefly, the WeVoS-Beta-SIM meta-algorithm works in the following way:
• First of all, an ensemble of Beta-SIM maps is trained.
• Then, the chosen quality/error measure is calculated for each neuron in all Beta-SIM maps.
• The fused map is initialized by calculating the centroids of the neurons in the same position of all the maps, that is, the superposition of the ensemble.
• For each of the neurons in the fused map, the average neuron quality and the total number of samples recognized in that position across the Beta-SIM maps are calculated.
• The weight of the vote for each neuron can be calculated with this information by using (7).
• To modify the position of the neuron in the fused map, the weights of each of the neurons in that position are fed to the final map.
• Finally, the learning rate in each case will be the weight of the vote for that neuron.
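The fusion steps listed above can be sketched as follows. This is a minimal illustration rather than the reference implementation: the function name `wevos_fuse` and the data layout are hypothetical, the vote weight is taken as the unit's share of recognised samples multiplied by its share of the quality measure (one reading of the voting rule), and the quality values are assumed to be non-negative with higher meaning better.

```python
def wevos_fuse(maps, hits, quality):
    """Fuse an ensemble of already trained maps by weighted voting.

    maps[m][p]:    weight vector of the unit at position p in map m.
    hits[m][p]:    number of samples recognised by that unit.
    quality[m][p]: quality value of that unit (assumed: higher is better).
    Returns the list of fused weight vectors, one per position.
    """
    M, P = len(maps), len(maps[0])
    fused = []
    for p in range(P):
        dim = len(maps[0][p])
        # Initialise the fused unit as the centroid of the M units at position p
        unit = [sum(maps[m][p][j] for m in range(M)) / M for j in range(dim)]
        total_hits = sum(hits[m][p] for m in range(M))
        total_q = sum(quality[m][p] for m in range(M))
        for m in range(M):
            if total_hits == 0 or total_q == 0:
                continue  # nothing to vote with at this position
            # Vote weight: share of recognised samples times share of quality
            v = (hits[m][p] / total_hits) * (quality[m][p] / total_q)
            # Feed the unit to the fused map, using the vote as learning rate
            for j in range(dim):
                unit[j] += v * (maps[m][p][j] - unit[j])
        fused.append(unit)
    return fused
```

With two single-unit maps at 0.0 and 2.0, equal quality, and 1 versus 3 recognised samples, the fused unit starts at the centroid 1.0 and ends closer to the second map, which recognised more samples.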

V. QUALITY AND TOPOLOGY MEASURES
Although several quality measures are used to assess the capabilities of Topology Preserving Maps, there is no single global one [11]; some of them are complementary to each other and assess different features of the final maps in different areas of visual representation. Among these quality measures, the three selected in this research, due to their complementarity, are the following:
Classification Error (CE): Using their inherent pattern matching characteristics, Topology Preserving Maps can in general terms be used for classification tasks. Intuitively, the samples activating the same neuron of the network are very likely to belong to the same class. When a new sample is presented to the network, it can be assigned to the class to which the majority of samples activating the same neuron belong. A consistent behavior when classifying samples points to a correctly trained map. Although this is not the main function of this kind of network, the number of wrongly classified samples has been used, to an extent, to assess the quality of the final map in numerous previous studies [11].
Mean Quantization Error (MQE): MQE is related to all forms of vector quantization and clustering algorithms. Thus, this measure completely disregards map topology and alignment. MQE is computed by determining the average distance of the dataset entries to the cluster centroids by which they are represented. In the case of the SOM, the cluster centroids are the characteristic vectors.
Topographic Error (TE): TE is the simplest of the topology preservation measures. A dataset is also needed to calculate this measure. For all data samples, the respective best and second-best matching units (1st BMU and 2nd BMU) are determined. If these BMUs are not adjacent on the map lattice, it is considered an error. Finally, the total error is normalized to a range from 0 to 1, where 0 means perfect topology preservation.
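As an illustration of the three measures, the sketch below computes them for a trained map given as a list of weight vectors. The function names are hypothetical, and two details are assumptions not fixed by the text: lattice adjacency is taken as differing by at most one step in row and column, and a test sample that lands on a unit with no training samples is counted as misclassified.

```python
import math


def dist(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def bmus(weights, x):
    """Indices of the best and second-best matching units for sample x."""
    order = sorted(range(len(weights)), key=lambda i: dist(weights[i], x))
    return order[0], order[1]


def mqe(weights, data):
    """Mean distance from each sample to its best matching unit."""
    return sum(dist(weights[bmus(weights, x)[0]], x) for x in data) / len(data)


def topographic_error(weights, grid, data):
    """Fraction of samples whose 1st and 2nd BMUs are not lattice neighbours."""
    errors = 0
    for x in data:
        a, b = bmus(weights, x)
        # Assumed adjacency: at most one step apart in both row and column
        if max(abs(grid[a][0] - grid[b][0]), abs(grid[a][1] - grid[b][1])) > 1:
            errors += 1
    return errors / len(data)


def classification_error(weights, train, train_labels, test, test_labels):
    """Label units by majority vote of training samples, then score the test set."""
    votes = {}
    for x, lab in zip(train, train_labels):
        votes.setdefault(bmus(weights, x)[0], []).append(lab)
    unit_label = {u: max(set(labs), key=labs.count) for u, labs in votes.items()}
    wrong = sum(unit_label.get(bmus(weights, x)[0]) != lab
                for x, lab in zip(test, test_labels))
    return wrong / len(test)
```

All three functions return values where 0 is the best possible result, matching the convention used for the error measures in the experiments.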

VI. EXPERIMENTS AND RESULTS
Several experiments have been designed and performed to investigate the capabilities of WeVoS-Beta-SIM and to compare it with other well-known Topology Preserving Maps such as SIM, Beta-SIM, SOM, WeVoS-SOM, ViSOM, MLHL-SIM and GNG.
The first type of experiment was designed to present visually some of the main characteristics of the algorithms used, such as the topology of the grid maps and their spread over the data. In the second type, a thorough analysis in terms of the three quality measures used in this research (CE, MQE and TE) was performed to validate the visual results obtained previously.
All the tests were run using a classic ten-fold cross-validation, so that the complete dataset is used for training and testing. The ensembles were trained using one of the simplest meta-algorithms for ensemble training: the bagging meta-algorithm [53].
In the case of WeVoS-Beta-SIM and WeVoS-SOM, the datasets have been reduced to 1/5 of their original size, and a single model and an ensemble of 5 maps are calculated for each one, comparing the performance of the models over datasets with the same inner structure.
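The evaluation protocol can be sketched as index bookkeeping only. The helper names below are hypothetical, and the way the 1/5 reduction is combined with bagging (five bootstrap resamples, each one fifth of the training fold) is an assumption made for illustration; the text does not specify these details.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def bagging_subsets(train, n_maps=5, fraction=0.2, seed=0):
    """Bootstrap resamples (drawn with replacement), one per ensemble member."""
    rng = random.Random(seed)
    size = max(1, int(len(train) * fraction))
    return [[rng.choice(train) for _ in range(size)] for _ in range(n_maps)]
```

Each training fold would then produce five index lists, one per map of the ensemble, while the held-out fold is used to compute the error measures.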

A. BENCHMARK DATASETS DESCRIPTION
A total of 14 diverse benchmark datasets related to industrial, economic and scientific case studies were used to validate the performance of the WeVoS-Beta-SIM algorithm. The datasets were taken from the UCI Machine Learning Repository [26] and present different characteristics, such as number of samples, features and classes, ranging from low to high dimensional datasets.
In Table 1, a summary of these diverse datasets is presented in terms of sample size, features and number of classes. Different map sizes and combinations of algorithm parameters were tested, and the best combination is presented.
Different map sizes have been tested (20 × 20, 20 × 25 and 30 × 30); however, the final maps are similar and the improvements in CE, MQE and TE are not significant. The best combination of parameters for each algorithm in each experiment is therefore presented in Table 2 (APPENDIX). WeVoS-Beta-SIM and WeVoS-SOM use the same parameters as their respective original models, Beta-SIM and SOM.

B. VISUALIZATION RESULTS
In this subsection, the visualization results obtained for 2 of the previously described benchmark datasets are presented (''Iris'' dataset and ''Landsat Satellite'' dataset). The aim is to present visually some of the main characteristics of the algorithms used, such as the topology of the grid maps and their spread over the data. Then, a complete analysis of the results, in terms of three quality measures (see Section V), is performed over all benchmark datasets.
The 2 datasets used in this subsection were selected due to their different levels of complexity. The first dataset is the well-known ''Iris'' dataset (with low complexity; only 3 classes, 150 samples and 4 features) and the second one is the ''Landsat Satellite'' dataset (with high complexity; 7 classes, 6435 samples and 36 features). The graphs presented in Fig. 2, 3, 4 and 5 illustrate the performance of each model for each dataset; the analytical results are presented in the next subsection (Section VI-C, ''Analytical Results''). Fig. 2 and 4 represent the adaptation of each map to the structure of the dataset under analysis (''Iris'' and ''Landsat Satellite'' datasets). They depict the lattices composing the maps embedded in a 2D input space. Fig. 2 and Fig. 4 present the datasets projected onto their first three principal components, and the final grid maps of the models are also embedded in the space of the three principal components. This approach has previously been applied satisfactorily [11], [15], [40], [41], [54] in order to visually support the analytical results presented in Tables 3-8. Fig. 3 and 5 show the final unit map for each algorithm, where only BMUs are displayed. Each BMU of each map is labelled based on the training inputs to which it reacts. This means that if one neuron (BMU) is activated by 20 input samples and 19 of them belong to class 1, this neuron is labelled as class 1 (red circles in Fig. 3 and 5) [11]. Although a BMU at a boundary could easily belong to one class or another, this only happens at the limits between classes, and only a few neurons could be misclassified.
GNG is not suitable for this 2D map representation, as some units are disregarded from the final model and therefore the topology preservation is lost.

1) VISUALIZATION RESULTS FOR IRIS DATASET
It is easily observed in Fig. 2 that Beta-SIM and WeVoS-Beta-SIM grid maps are more widely spread throughout the Iris dataset (represented as magenta dots), covering the input space better than the other algorithms. This better coverage over the dataset corresponds to a better MQE result (see Table 7 APPENDIX).
However, SOM, WeVoS-SOM and ViSOM conserve the topology of the map very well, as their grid maps contain just a few twists and folds (Fig. 2d, 2e and 2g). Therefore, they have the lowest TE values among all algorithms (see Table 8 APPENDIX).

Comparing Beta-SIM and WeVoS-Beta-SIM, Fig. 2b and 2c show how WeVoS-Beta-SIM obtains a better topology of the grid map than Beta-SIM (which means better TE results), due to the fact that the WeVoS-Beta-SIM grid map contains fewer twists and folds than that of Beta-SIM. At the same time, the WeVoS-Beta-SIM grid map adapts its structure slightly better to the dataset, covering the Iris dataset more adequately (which means a lower MQE value). Fig. 3 shows that, in general, the WeVoS-Beta-SIM algorithm (Fig. 3c) provides the map with more compact and clearly separated groups. However, differences with the other algorithms are minor. All algorithms obtain maps where class 1 (red circles in Fig. 3) is clearly separated from the other 2 classes (class 2, blue squares, and class 3, green triangles). Differences between maps are only appreciable when they are compared in terms of the separation of classes 2 and 3, where the WeVoS-Beta-SIM algorithm presents these 2 classes in more compact and clearly defined groups (Fig. 3c).

2) VISUALIZATION RESULTS FOR LANDSAT SATELLITE DATASET
Fig. 4 shows how the Beta-SIM (Fig. 4b), GNG (Fig. 4h) and WeVoS-Beta-SIM (Fig. 4c) algorithms outperform the other algorithms by distributing the units of their grid maps over the Landsat Satellite dataset (represented as red dots) in the best possible way. Units of the grid maps are close to the input samples over the whole dataset, which leads to lower MQE values than the other algorithms.
A similar situation occurs with the SIM and MLHL-SIM algorithms (Fig. 4a and 4f), as they obtained grid maps which adapt their structure well to the dataset, but not as well as the previously mentioned algorithms.
Again, the SOM, WeVoS-SOM and ViSOM algorithms are the ones which best preserve the topology of the grid maps (Fig. 4d, 4e and 4g), as their maps have fewer twists and folds than those of the other algorithms. Therefore, they obtain the lowest TE values (see Table 8 APPENDIX).
The WeVoS-Beta-SIM (Fig. 4c) and Beta-SIM (Fig. 4b) algorithms obtain similar final grid maps, but WeVoS-Beta-SIM preserves the topology of the grid map better than the Beta-SIM algorithm. Comparing both figures (Fig. 4b and 4c), the WeVoS-Beta-SIM grid map presents fewer twists and folds than the Beta-SIM grid map, and therefore has lower TE values.
It can be seen in Fig. 5 that WeVoS-Beta-SIM (Fig. 5c) provides the best visual representation through a smoother map compared to the rest of the algorithms. The map presents, in general, compact and unmixed groups (there is no mixing of BMUs from different classes). Beta-SIM also obtains compact groups (Fig. 5b), but some of the neurons associated to different classes are mixed. This fact was observed in Fig. 4 where the Beta-SIM grid map (Fig. 4b) contained more twists and folds than WeVoS-Beta-SIM (Fig. 4c).
In the case of SOM and WeVoS-SOM (Fig. 5d and 5e), both algorithms produced maps where groups are clearly defined; however, some groups present mixed classes. For instance, at the top of both maps (Fig. 5d and 5e), class 2 (blue squares) is divided into two groups separated by a group of class 1 (red circles). In the case of classes 2 and 6 (blue squares and magenta asterisks respectively), several neurons appear lost in the middle of both maps.

3) CONCLUSIONS OF THE VISUALIZATION RESULTS
The results suggest that WeVoS-Beta-SIM provides a better visual representation of the datasets than the other algorithms, as it is able to spread its grid map widely, covering the input space better than the other tested models. At the same time, WeVoS-Beta-SIM obtains grid maps with fewer twists and folds than the Beta-SIM algorithm, which signifies a better topology of the map.
The improvement on visual representation, achieved by WeVoS-Beta-SIM, is notably higher when the complexity of the datasets increases. Using the previous examples, differences between maps were minor for the Iris dataset (low complexity), whereas differences were higher for the Landsat Satellite dataset (high complexity).

C. ANALYTICAL RESULTS
In order to validate the results obtained, statistical tests for the three quality measures were performed consisting of an ANOVA + post-hoc analyses. The statistical results for CE, MQE and TE are presented in Tables 3 to 5 (p-values) and Tables 6 to 8 (average testing values ± standard deviation).
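As a reminder of what the ANOVA step computes, the sketch below evaluates the one-way ANOVA F statistic over per-algorithm lists of error values (for instance, one value per cross-validation fold). The function name is hypothetical and the sketch stops at the F statistic: the p-value would be read from the F distribution with (k - 1, n - k) degrees of freedom, and the post-hoc comparisons used in the study are not covered here.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over lists of error values.

    groups: one list of measurements per algorithm being compared.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variability of the values around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A large F value indicates that the differences between the mean errors of the algorithms are large relative to the fold-to-fold variation within each algorithm.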
All measures presented are error measures for the testing dataset, so the desired value is always as close to 0 as possible. The CE is presented in percentage form and normalized between 0 and 1, whereas the rest of the measures are expressed as absolute values.
This series of experiments analyzes 2 different aspects of the novel WeVoS-Beta-SIM and the other tested algorithms:
• The performance of WeVoS-Beta-SIM in comparison with the other 7 topology preserving models, in terms of the CE, MQE and TE quality measures.
• The effect of modifying the number of data samples used during the training process for all algorithms under study. This was done in order to emulate the addition of noise or instability in the datasets [16].

1) ANALYSIS OF RESULTS IN TERMS OF CE
Results presented in Fig. 6 show that GNG, Beta-SIM and WeVoS-Beta-SIM often obtain better results than the other algorithms (SOM, ViSOM, WeVoS-SOM, SIM, MLHL-SIM) in terms of CE values. It should be noted that these differences in CE can only be seen when the complexity of the datasets is high. For low complexity datasets (in this research, those having fewer than 5 classes; datasets D1 to D8), the differences between the CE results obtained by the different algorithms were not statistically significant (see Table 3 APPENDIX).
Results for this experiment confirm the conclusions obtained from the visual representation test (Fig. 2 to 5). For example, in the case of the Iris dataset (a dataset with low complexity), all algorithms presented very similar final maps (Fig. 3), so similar CE values for all algorithms were expected. In the case of the Landsat Satellite dataset (high complexity), the WeVoS-Beta-SIM algorithm obtained a map (Fig. 5c) with more compact groups and fewer mixed classes, therefore obtaining a better CE value than the other algorithms.
It can also be seen in Fig. 6 that when the complexity of the datasets increases, the WeVoS-Beta-SIM obtains better CE results than the Beta-SIM algorithm. However, when a statistical test of the results is performed (Table 3 APPENDIX) differences between CE values were not statistically significant.
Finally, the change in CE values when the number of samples is increased was analyzed (Fig. 6). The effect of adding such instability in this case is not particularly evident, as the change in CE values did not follow a clear tendency.
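For reference, a classification error of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of this study: each map unit is labelled with the majority class of the training samples for which it is the best matching unit (BMU), and a test sample counts as an error when its BMU carries a different label or no label. The names `bmu`, `label_units` and `classification_error` are hypothetical.

```python
import math
from collections import Counter


def bmu(sample, weights):
    """Index of the unit whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: math.dist(sample, weights[i]))


def label_units(train_x, train_y, weights):
    """Label each winning unit with the majority class of its training samples."""
    votes = {}
    for x, y in zip(train_x, train_y):
        votes.setdefault(bmu(x, weights), Counter())[y] += 1
    return {unit: counts.most_common(1)[0][0] for unit, counts in votes.items()}


def classification_error(test_x, test_y, weights, unit_labels):
    """Fraction (between 0 and 1) of test samples assigned a wrong or missing label."""
    wrong = sum(
        1
        for x, y in zip(test_x, test_y)
        if unit_labels.get(bmu(x, weights)) != y
    )
    return wrong / len(test_x)
```

Under this definition, a map with compact, well separated groups yields a low CE because few units win samples from more than one class.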

2) ANALYSIS OF RESULTS IN TERMS OF MQE
Results for MQE values are presented in Fig. 7, where it is clear that SOM and WeVoS-SOM algorithms obtain the highest MQE values for all datasets. This is expected based on results obtained in the visual representation test (Fig. 2 and 4), where WeVoS-SOM and SOM grid maps do not spread over the datasets as well as the other algorithms.
The rest of the algorithms behave in a similar way in terms of MQE, and none of them outperforms the others when the complexity of the dataset is low. However, when the complexity of the datasets increases, 2 algorithms stand out over the others in terms of MQE results: WeVoS-Beta-SIM and Beta-SIM. Again, these results confirm the conclusions obtained by the visual representation tests (Section 5.2): when the complexity of the datasets is high, the maps of these algorithms cover the input space better than those of the other algorithms. Fig. 7 shows that the MQE results obtained by WeVoS-Beta-SIM are often better than those of the simple model Beta-SIM, even when the complexity of the dataset is low. However, these differences are not always statistically significant (see Table 4, APPENDIX).
Finally, the change in MQE values when the number of samples is increased was analyzed. In this case, adding such instability does not present a particular effect on the algorithms, with the MQE being more dependent on the total size of each dataset.
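As a reminder of what this measure captures, the sketch below computes the MQE in its usual form: the average Euclidean distance between each test sample and the weight vector of its best matching unit. The function name and the flat list-of-vectors representation of the map are assumptions made for this example.

```python
import math


def mean_quantization_error(samples, weights):
    """Average distance between each sample and its closest map unit.

    samples: list of input vectors (the testing dataset).
    weights: list of unit weight vectors of the trained map.
    """
    total = 0.0
    for sample in samples:
        # Distance to the best matching unit (the closest weight vector).
        total += min(math.dist(sample, w) for w in weights)
    return total / len(samples)
```

A map whose units spread widely over the data leaves every sample close to some unit, which is why better coverage of the input space corresponds to a lower MQE.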

3) ANALYSIS OF RESULTS IN TERMS OF TE
Finally, the TE results are analyzed and presented in Fig. 8. The TE is related to the topology of the final maps, as was shown in the visual representation section (Section 5.2). Based on those results, it was expected that WeVoS-SOM, SOM and ViSOM would obtain the lowest TE, as these algorithms produced maps with fewer twists and folds. The results in Fig. 8 confirm this, but differences can only be seen when the complexity of the dataset is high.
When the complexity of the datasets increases and differences between algorithms are clear (datasets from D8 to D14), the GNG and Beta-SIM algorithms obtain the worst TE values. In the case of GNG, this is due to the fact that some units of the final map are discarded from the final model and therefore the topology preservation is lost. In the case of the Beta-SIM algorithm, it focuses on distributing the units of the grid map over the dataset, so the final map contains several twists and folds (Figs. 2b and 4b).
WeVoS-Beta-SIM (Fig. 8) consistently improves the TE results of the simple model Beta-SIM, especially when the complexity of the datasets increases. This means that the final map provides a better visualization regarding the topology preservation of the map (better TE) while keeping similar MQE values. The statistical tests presented in Table 5 (APPENDIX) validate these results when the complexity of the datasets is high (as p-values are lower than the significance level of 0.05).
The effect of adding instability by decreasing the number of samples for the training process was analyzed for WeVoS-Beta-SIM and its simple version Beta-SIM. When the complexity of the datasets is high, WeVoS-Beta-SIM presents very similar TE values as the number of samples increases. However, the Beta-SIM algorithm (Fig. 8) obtains unstable TE values (increasing and decreasing without a clear tendency). Therefore, it can be concluded that WeVoS-Beta-SIM, in terms of TE, is less sensitive to noise than the Beta-SIM algorithm. Consequently, WeVoS-Beta-SIM obtains maps with less distortion, which in turn provide a better visual representation of the internal dataset structure.
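The link between TE and twists or folds in the map can be seen in a minimal sketch of the measure in its common form: the fraction of test samples whose two closest units are not neighbours on the grid. The function name, the `(row, col)` encoding of grid positions and the 8-neighbourhood adjacency rule are assumptions of this example and may differ from the setup used in the experiments.

```python
import math


def topographic_error(samples, weights, grid_positions):
    """Fraction of samples whose first and second BMUs are not grid neighbours.

    weights[i] is the weight vector of unit i and grid_positions[i] its
    (row, col) on the map. Two units are treated as neighbours when they
    differ by at most one step along each axis (assumed adjacency rule).
    """
    errors = 0
    for sample in samples:
        # Rank all units by distance to the sample and keep the two closest.
        ranked = sorted(
            range(len(weights)),
            key=lambda i: math.dist(sample, weights[i]),
        )
        first, second = ranked[0], ranked[1]
        (r1, c1), (r2, c2) = grid_positions[first], grid_positions[second]
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(samples)
```

When a map folds over itself, units that are far apart on the grid end up close together in the input space, so the two closest units of a sample are often not adjacent and the TE rises.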

4) CONCLUSION OF THE ANALYTICAL RESULTS
Analytical results of this subsection confirm the conclusions obtained by the visual representation tests (Section 5.2). When complexity of the datasets is high, WeVoS-Beta-SIM algorithm often obtains the best MQE results, which corresponds to a more open and widely spread grid map over the dataset.
WeVoS-Beta-SIM also obtains the best CE results providing final maps with clearly defined groups, again when complexity of the datasets is high.
At the same time, the TE results of WeVoS-Beta-SIM are better than those obtained by the simple model Beta-SIM, giving the final map better topology preservation. Therefore, it can be concluded that WeVoS-Beta-SIM is able to provide the best visual representation of the internal structure of datasets when their complexity is high (i.e., in this research, more than 5 classes).
Finally, in terms of stability, WeVoS-Beta-SIM is less sensitive to noise in terms of TE than the simple model Beta-SIM, which leads to maps with less distortion and, in turn, to a better visual representation of the internal dataset structure. However, CE and MQE results do not show a clear tendency when instability is added.

VII. CONCLUSIONS AND FUTURE WORK
In this research, a novel topology-preserving model known as WeVoS-Beta-SIM has been presented, analyzed and compared with other well-known topology preserving models. This novel algorithm aims to obtain the best possible topology preserving summary in order to improve the visual representation of high dimensional datasets and to increase the stability of the original model (Beta-SIM).
Applying the WeVoS ensemble meta-algorithm to the Beta-SIM algorithm improves the visual representation of the internal structure of highly complex datasets (in this research, those with more than 5 classes), generating grid maps that are widely spread and cover the input space better than the other models (better MQE values). At the same time, WeVoS-Beta-SIM obtains maps with fewer twists and folds than the simple model Beta-SIM (better TE values), which signifies a better topology of the map.
As can be seen in the results, the improvement in visual representation achieved by WeVoS-Beta-SIM is notably higher when the complexity of the datasets increases. With very simple datasets, it only makes slight improvements or can even obtain worse results. That said, its usefulness has been proven in the case of more complex datasets (more than 5 classes), where the extra complexity of calculating the ensemble leads to a considerable increase in performance, obtaining a better organization and visualization of the presented information.
Results also show that WeVoS-Beta-SIM is less sensitive to noise in terms of TE than the simple model Beta-SIM. This leads to maps with less distortion, which in turn provide a better visual representation of the internal dataset structure.
All the previous improvements make WeVoS-Beta-SIM a powerful new tool for the data mining community, one that should take its place among existing Topology Preserving Maps.
Future work will be focused on using WeVoS-Beta-SIM to analyze challenging real datasets to solve problems in the fields of big data, electric vehicles, energy efficiency, cybersecurity, etc. Also, the use and comparison with other ensembles will be explored for all algorithms tested in this research.

APPENDIX
See Tables 2-8.

HÉCTOR QUINTIÁN (Member, IEEE) received the M.S. degree in computer science from the University of A Coruña, Spain, in 2010, and the Ph.D. degree from the University of Salamanca in 2017.
He is an Assistant Professor with the University of A Coruña. He has published over 30 peer-reviewed articles, 21 conference papers, and 18 conference editorials. He has patented several software models. He has been the co-organizer, the program committee chair, and the session chair of several international conferences. His research interests include neural networks, with a particular focus on exploratory projection pursuit, maximum likelihood Hebbian learning, self-organizing maps, multiple classifier systems, and their applications to control engineering, optimization, and education.
Dr. Quintián is the Vice-Chair of the IEEE-SMC Spanish Chapter. He has actively contributed to several current projects in the EU, including IT4Innovation, ICT Action COST IC1303, and IntelliCIS, and several national and regional projects.
EMILIO CORCHADO (Member, IEEE) received the Ph.D. degree in computer science from the University of Salamanca, Spain.
He is an RTD Expert hired by international organizations, such as the European Commission, the Grant Agency of Czech Republic, and the Spanish National Agency for Assessment and Forecasting. He has collaborated with SMEs and new companies in the innovation field in about 40 projects. He has patented software models, and he owns the IP of more than ten ICT tools and models. He is currently a Full Professor in computer and automatic science with the University of Salamanca. He has published over 100 peer-reviewed articles on a range of topics, including knowledge management and risk analysis, intrusion detection systems, the food industry, artificial vision, and the modeling of industrial processes. His research interests include neural networks, with a particular focus on exploratory projection pursuit, maximum likelihood Hebbian learning, self-organizing maps, multiple classifier systems, and hybrid systems.
Dr. Corchado has been the Organizing Chair, the Program Committee Chair, the Session Chair, and the General Chair of a number of conferences, such as the International Conference on Hybrid Artificial Intelligence Systems (HAIS), the International Conference on Intelligent Data Engineering and Automated Learning (IDEAL), and the International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES). He was the Chair of the IEEE Spanish Section, from 2014 to 2015, and he has actively contributed to several current projects in the EU, including SOFTCOMP, IT4Innovation, ICT Action COST IC1303, and IntelliCIS NISIS.