A Quick Algorithm to Determine 2-Optimality Consensus for Collectives

Nowadays, to solve a problem, people and systems typically use knowledge from different sources. A binary vector is a useful structure for representing knowledge states, and determining the consensus of a binary vector collective is helpful in many areas. However, determining a consensus that satisfies the postulate 2-Optimality is an NP-hard problem; therefore, many heuristic algorithms have been proposed. The basic heuristic algorithm is the fastest in the literature and the most widely used to solve this problem. Its computational complexity is O(m²n). In this study, we propose a quick algorithm (called QADC) to determine the 2-Optimality consensus. The QADC algorithm is based on a new approach for calculating the distances from a candidate consensus to the collective members. The computational complexity of the QADC algorithm is reduced to O(mn), and the consensus quality of the QADC algorithm and that of the basic heuristic algorithm are the same.


I. INTRODUCTION
Using knowledge from different sources for decision-making is becoming increasingly popular [1]. For example, to decide on a problem in daily life, people typically search for information on the Internet or ask the opinions of experts. In this way, they generally reach a suitable solution. Another example is a distributed detection system in which agents share their logical opinions on events, and the system then makes decisions [2].
In general, a set of knowledge states from different sources is considered a collective or profile. A collective consists of the knowledge states of different agents, experts, or individuals referring to the same problem [3], [4]. The knowledge states of a collective are frequently in conflict [5]. For example, many meteorological stations forecast the weather for the same region, and the forecasts from these stations differ. A collective consisting of knowledge states in conflict is called a conflict collective or conflict profile [6]. Besides, in a collective, the knowledge states are often uncertain [7]. Thus, integrating the knowledge states of a collective into one consistent state, termed the collective knowledge or consensus, is a complicated task. (The associate editor coordinating the review of this manuscript and approving it for publication was Ching-Ter Chang.)
Uncertainty is generally a situation that involves unknown or imperfect information [8], and it is an attribute of information [9]. To treat data with uncertainty, many theories have been introduced, such as rough set theory [10], soft set theory [11], and fuzzy set theory [12].
A conflict takes place when at least two bodies have different opinions on the same subject [6]. Pawlak introduced the first formal model for conflict analysis; in this model, a set of tools for conflict analysis was presented [13]. The model is straightforward, but it does not allow agents to express complex opinions [6]. Pawlak's model was enhanced by Skowron et al. [14]. The enhanced model defines local states of agents and examines various levels of conflicts; the attribute values in this model are atomic.
In [15], Nguyen et al. presented a general model for conflict and knowledge inconsistency. In this model, such factors as conflict representation and consistency measures for conflict collective are considered. Attribute values are multivalued. To measure the consistency, eight postulates are proposed, and five consistency functions are defined. Besides, a methodology for using inconsistency of knowledge in a collective to determine its consensus is presented [5]. Determining consensus plays an essential role in many research areas [4], [16]. Consensus methods have been proposed for consensus determination [17].
In consensus methods, many postulates are used to define consensus functions in which the postulates 1-Optimality (or Kemeny median) and 2-Optimality are essential postulates [6]. However, no consensus function concurrently satisfies both the postulate 1-Optimality and the postulate 2-Optimality. The consensus satisfying the postulate 1-Optimality or the postulate 2-Optimality is the best representative of the collective. Besides, the consensus satisfying the postulate 2-Optimality is more uniform than that satisfying the postulate 1-Optimality. Although the criterion 2-Optimality is not well-known, in many cases, it is better than the criterion 1-Optimality [3], [6]. Determining the 2-Optimality consensus is an NP-hard problem [6], and heuristic algorithms have been developed for different data structures, such as a complex tree, partition, ontology, and binary vector [6], [18], [19].
We consider the following situation. In this situation, n experts are asked for their opinions on a given problem. Each of them answers m different questions, choosing one of two possible answers: ''Yes'' or ''No'' (''Yes'' corresponds to 1, and ''No'' corresponds to 0). The collective, which consists of the opinions of the experts, is represented by n binary vectors of length m. The second example refers to disease-disease relationships. A group of n patients has hepatocellular carcinoma. Each patient is expressed as a vector with m attributes, and each attribute represents a specific disease. If a patient has a specific disease, the value of the corresponding component is 1, otherwise 0. The collective, which consists of the diseases of the patients, is also stored as n binary vectors of length m. Determining the consensus of such collectives is useful in many practical settings; therefore, in this study, we consider determining the consensus in such situations.
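The binary vector representation described above is straightforward to set up in code. The sketch below uses hypothetical data (four experts answering five yes/no questions); only the shape of the structure matters:

```python
# A collective of n = 4 experts answering m = 5 yes/no questions.
# Each row is one expert's knowledge state (1 = "Yes", 0 = "No").
# The values below are hypothetical, for illustration only.
collective = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
]
n = len(collective)       # number of collective members
m = len(collective[0])    # length of each binary vector
```

The patient example is stored identically: one row per patient, one column per disease.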
Heuristic algorithms have been introduced to determine the consensus for binary vector collectives [19], [30], [31]. The basic heuristic algorithm is the fastest and best algorithm, and it is the most widely used to solve this task. The computational complexity of this algorithm is O(m²n), where m is the length of the members and n is the size of the collective [19].
Increasing the number of knowledge sources in the world and rapid development in information technology have facilitated the use of these sources for finding solutions to different problems; as a result, the collective size is increasing [32]. Determining the 2-Optimality consensus of a binary vector collective with a large size may require more time [4]. In this scenario, there is a need to develop a quick algorithm to determine the 2-Optimality consensus of such a collective.

A. MOTIVATION
The motivation for this study is to find a new algorithm with lower computational complexity whose consensus quality is higher than or equal to that of the basic heuristic algorithm.

B. CONTRIBUTIONS OF THIS STUDY
The main contributions of this study are as follows:
1) We propose a new method to calculate the sum of squared distances from a candidate consensus to the members of the collective. The new method reduces the computational complexity of calculating this sum. Based on this calculation method, we propose a quick algorithm whose computational complexity is O(mn).
2) We prove that the consensus quality of the QADC algorithm and that of the basic heuristic algorithm are the same, both in theory and in experiment.
The remainder of this paper is organized as follows. Section II provides a short review of the consensus problem and the 2-Optimality consensus. In Section III, basic notions are introduced. Section IV presents a new approach to calculate the distances from a candidate consensus to the collective members, together with the QADC algorithm. In Section V, we measure the consensus quality of the QADC algorithm and compare the QADC algorithm with the best algorithm of previous studies in two respects: running time and consensus quality. Finally, conclusions and future directions are provided in Section VI.

II. RELATED WORK
In computer science, the consensus problem has a long history [33], [34], and during recent years it has become an attractive research area [35]-[37]. The consensus problem forms the foundation of the field of distributed computing [34], in which all the agents communicate and update their states to achieve an agreement in the network. The consensus concept was introduced in the control community for describing the collective behavior of a group of systems, called multi-agent systems [38]-[40]. The consensus problem in multi-agent systems has been intensively studied in the control community and has been applied in vehicle formation, sensor networks, and social networks [39], [41]. The IoT has developed rapidly during recent years, and its applications have the potential to affect every aspect of daily human life [42]. Many consensus problems exist in this field, for example, resource allocation [43], task allocation [44], [45], and decision-making in service-oriented systems [46].
There are numerous examples of consensus in economics. Prediction is an essential activity in several business processes, but it becomes challenging in the case that historical data are not available, such as forecasting demand for a new product. A prediction market that aggregates the opinions of the crowd is an efficient approach to solve such problems [47]. Another example is the blockchain that creates a rapid change in economics, such as transactions, accounting, contracts, and records [48], [49]. The consensus problem forms the foundation of blockchain [50], [51], and consensus algorithms maintain the existence of blockchain, for example, PoW, PoS, and PoB [52].
In the field of medicine, consensus problems emerged in the 1950s from a wish to synthesize clinician and expert opinions on clinical practice and research programs [53]. Nowadays, consensus is widely used in this field, and it plays an essential role in diagnosis, management, treatment, health services, etc. [28], [29].
In general, there are three approaches to resolving consensus problems in previous studies [6]. In the axiomatic approach, many axioms are used to specify the conditions that should be met by consensus functions. First, seven conditions for the consensus functions were presented. Then, a set of ten postulates for consensus functions was defined, in which the postulates 1-Optimality and 2-Optimality are essential for determining consensus. The constructive approach resolves the consensus problem in two aspects: the relationship between elements and their structure. In the optimization approach, some optimality rules are used to define consensus functions [6].
In general, the consensus is the reasonable choice if the conflict participants refer to the same problem [3]. Assume that the opinions included in the conflict content reflect an unknown solution to a problem. We call this solution the proper solution to the problem, and the following two cases may take place [3]:
1) The proper solution is independent of the participants' opinions, such as the problem of predicting the GDP of a country.
2) The proper solution is dependent on the participants' opinions, such as the problem of the US presidential election.
In the first case, the proper solution to the problem exists; however, the participants do not know it. Thus, the participants may ''guess'' the proper solution [3]. In the second case, the participants' opinions decide the solution. In both cases, the consensus should satisfy the following conditions:
• It should best reflect the given versions, and/or
• It should be a good compromise that could be acceptable to the participants.
The first condition is appropriate for the first case: the opinions given by the participants reflect the ''hidden'' and independent solution, but it is not known to what degree, so the consensus should best reflect the participants' opinions. The best criterion for consensus choice here is the criterion 1-Optimality. The second condition refers to the second case, in which the proper problem solution depends on the opinions given by the participants. Thus, the consensus should best represent the participants' opinions and should reflect them to an equal degree. The best criterion for consensus choice here is the criterion 2-Optimality. A consensus chosen by the criterion 2-Optimality is more uniform than one chosen by the criterion 1-Optimality [3], [5].
The postulate 2-Optimality is used in many applications [56], [57]. In bioinformatics, for example, this postulate is applied to the multiple structure alignment problem, which is described as follows: ''Given a set of proteins X, compute a transformation (i.e., rotation and translation) for each protein, and generate a 2-Optimality consensus'' [58]. Heuristic algorithms have been proposed for solving this problem in [58], [59]. Let n be the maximum length of the k proteins; then, the time complexities of the algorithms are O(m²k²) or O(kn² + kn²), depending on the initial consensus [58], [59]. The problem of generating a consensus tree from a given collective of phylogenetic trees is used to reconstruct the evolutionary history of a set of organisms, and the postulate 2-Optimality is used for determining the consensus tree [60]. The time complexity of the algorithms MW and ADDTREE is O(n⁵); that of the algorithm FITCH is O(n⁴); and that of the algorithms UNJ and NJ is O(n³), where n is the number of organisms [61].
In e-commerce, decision-making has become a necessary component of business activity [62]. In a typical case, the structure of the decision representation is a set of decision elements that describes an economic problem. These elements are ordered in a sequence to be processed during decision realization. The 2-Optimality consensus is used to develop multiagent decision support systems. In [57], a heuristic algorithm with a computational complexity of O(n²m) + O(3nm) is introduced. This algorithm is implemented in the stock exchange and the a-Trader multi-agent system [62], [63].
Determining the 2-Optimality consensus for binary vector collectives is an NP-hard problem, and the computational complexity of the brute-force algorithm to determine the optimal consensus is O(n·2ᵐ); thus, heuristic algorithms have been developed [19]. First, the basic heuristic algorithm was introduced; its computational complexity is O(m²n). Then, two additional heuristic algorithms, H2 and H3, were developed based on the basic heuristic algorithm. In the algorithm H2, the initial consensus is set to the 1-Optimality consensus of the collective X. The time complexities of the algorithms H2 and H3 are O(m²n). In experiments, the basic heuristic algorithm is 3.8% faster than the algorithm H2 and 3.71% faster than the algorithm H3 [19]. Besides, the differences among the consensus qualities of the basic heuristic algorithm, the algorithm H2, and the algorithm H3 are not statistically significant [19].
Genetic algorithms have also been used to determine the 2-Optimality consensus for a binary vector collective. In the genetic algorithm Gen1, the authors applied a roulette wheel choice, where the fitness function is g(x) = 1/(d(x, X) + 1). In the genetic algorithm Gen2, the authors utilized a tournament choice, where g(x) = d(x, X) is the fitness function. These two genetic algorithms were compared with the basic heuristic algorithm. In the experiment, the algorithm Gen1 is 99.47% slower than the basic heuristic algorithm, and the algorithm Gen2 is 99.56% slower. Besides, the consensus quality of the algorithm Gen1 is 1.34% lower than that of the basic heuristic algorithm, and the consensus quality of the algorithm Gen2 is 0.01% higher than that of the basic heuristic algorithm. However, neither genetic algorithm is practical [19] because of its colossal running time.
In [31], a heuristic algorithm based on the basic heuristic algorithm and vertical partitioning was proposed. First, vertical partitioning was used to divide the collective into two parts. Then, the 2-Optimality consensuses of these two parts were determined. Finally, the 2-Optimality consensus of the collective was determined by combining the 2-Optimality consensuses of the two parts. The time complexity of this algorithm is O(m²n).
The basic heuristic algorithm is the fastest in the literature. The consensus quality of the algorithm Gen2 is only 0.01% higher than that of the basic heuristic algorithm; however, the algorithm Gen2 is not practical because of its unacceptable running time [19]. Thus, the basic heuristic algorithm is the best algorithm for determining the consensus of binary vector collectives, and it is used for comparison with the QADC algorithm.

III. BASIC NOTIONS

A. COLLECTIVE AND COLLECTIVE KNOWLEDGE
Let U be a finite set of objects representing all potential knowledge states for a given problem. In the set U, elements can contradict each other. Let Π_b(U) denote the set of all b-element subsets with repetitions of the set U for b ∈ N, and let Π(U) = ⋃_{b∈N} Π_b(U). Thus, Π(U) is the finite set of all nonempty subsets with repetitions of the set U. A set X ∈ Π(U) is considered a collective, where each element x ∈ X represents the knowledge state of a collective member [3], [5].
Collective knowledge, or consensus, of a collective is understood as a representative of this collective. The two most popular criteria for determining consensus are 1-Optimality and 2-Optimality. For a given collective X ∈ Π(U), the consensus of X is determined by the following:

d(x*, X) = min_{y∈U} d(y, X)  (1-Optimality),
d²(x*, X) = min_{y∈U} d²(y, X)  (2-Optimality),

where x* is the consensus of X, d(x*, X) is the sum of distances from the consensus x* to the members of the collective X, and d²(x*, X) is the sum of squared distances from the consensus x* to the members of the collective X [6].
Definition 1: A binary vector collective is defined as X = {x_1, x_2, . . . , x_n}, where n is the number of collective members, and x_i (i = 1, 2, . . . , n) is a binary vector of length m.
Each element x_i ∈ X is described as x_i = (x_{i,1}, x_{i,2}, . . . , x_{i,m}) with x_{i,j} ∈ {0, 1}. For s, t ∈ U, the distance function is described as follows:

d(s, t) = Σ_{j=1}^{m} |s_j − t_j|,   (1)

where s = (s_1, s_2, . . . , s_m) and t = (t_1, t_2, . . . , t_m).

Definition 2: The sum of distances from a vector x_c of length m to a binary vector collective X is defined as follows:

d(x_c, X) = Σ_{i=1}^{n} d(x_c, x_i).   (2)

The sum of squared distances from the vector x_c of length m to the binary vector collective X is defined as follows:

d²(x_c, X) = Σ_{i=1}^{n} (d(x_c, x_i))².   (3)
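The distance function (1) and the sums (2) and (3) follow directly from the definitions above. The sketch below illustrates them on a small hypothetical collective:

```python
def d(s, t):
    """Distance (1): Hamming distance between two equal-length binary vectors."""
    return sum(abs(si - ti) for si, ti in zip(s, t))

def d1_sum(xc, X):
    """Sum of distances (2) from candidate xc to all members of collective X."""
    return sum(d(xc, xi) for xi in X)

def d2_sum(xc, X):
    """Sum of squared distances (3) from candidate xc to all members of X."""
    return sum(d(xc, xi) ** 2 for xi in X)

# Hypothetical collective of n = 3 members of length m = 3.
X = [[1, 0, 1], [0, 0, 1], [1, 1, 1]]
xc = [1, 0, 1]
# Distances from xc to the members are 0, 1, 1,
# so d1_sum(xc, X) = 2 and d2_sum(xc, X) = 2.
```

The postulate 2-Optimality minimizes d2_sum, while 1-Optimality minimizes d1_sum.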

B. BASIC HEURISTIC ALGORITHM
First, this algorithm randomly generates one initial candidate consensus x_c. Then it determines the value of each component of x_c in turn: for each component, both possible values are evaluated by recomputing the sum of squared distances d²(x_c, X), and the value yielding the smaller sum is kept. The computational complexity of the basic heuristic algorithm is O(m²n).
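Since the algorithm listing itself is not reproduced here, the following is a hedged Python sketch of the procedure as described: a random initial candidate, then one pass over the m components, fully recomputing d²(x_c, X) for each trial value. Each of the m re-evaluations costs O(mn), which gives the stated O(m²n) complexity. The function names are ours, not the paper's:

```python
import random

def hamming(s, t):
    """Distance (1) between two equal-length binary vectors."""
    return sum(abs(a - b) for a, b in zip(s, t))

def d2(xc, X):
    """Sum of squared distances (3); a full recomputation costs O(mn)."""
    return sum(hamming(xc, xi) ** 2 for xi in X)

def basic_heuristic(X, seed=None):
    """One-pass greedy sketch of the basic heuristic algorithm, O(m^2 n)."""
    rng = random.Random(seed)
    m = len(X[0])
    xc = [rng.randint(0, 1) for _ in range(m)]   # random initial candidate
    best = d2(xc, X)
    for k in range(m):          # determine components 1..m in turn
        xc[k] ^= 1              # try the other value of component k
        cand = d2(xc, X)        # full O(mn) re-evaluation
        if cand < best:
            best = cand         # keep the flip
        else:
            xc[k] ^= 1          # revert to the previous value
    return xc, best
```

With a degenerate collective whose members are all identical, any initial candidate converges to that member with a squared-distance sum of 0.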

IV. QUICK ALGORITHM TO DETERMINE 2-OPTIMALITY CONSENSUS
This section introduces a new method to calculate the distances from a candidate consensus to the members of a binary vector collective. Based on this method, we propose a new algorithm to determine the 2-Optimality consensus. The computational complexity of the QADC algorithm is O(mn). The consensus quality of the basic heuristic algorithm and that of the QADC algorithm are the same if their initial candidate consensuses are the same.

A. METHOD FOR CALCULATING DISTANCES
The sum of squared distances from a candidate consensus x_c to the collective members is computed by (3). The computational complexity of calculating the value d²(x_c, X) depends on the computational complexity of calculating the values d(x_c, x_i). The distance between x_c and x_i for i = 1, 2, . . . , n is computed by (1). The computational complexity of calculating the value d²(x_c, X) can be reduced if that of calculating the values d(x_c, x_i) is reduced. Thus, an efficient method for calculating the values d(x_c, x_i) needs to be studied.
Theorem 1: Let x_u = (x_{u,1}, x_{u,2}, . . . , x_{u,m}) and x_w = (x_{w,1}, x_{w,2}, . . . , x_{w,m}) be binary vectors of the same length m. Assume that x_u and x_w differ only in the k-th component. Then, for any binary vector x_i of length m, d(x_i, x_u) = d(x_i, x_w) + 1 if x_{i,k} = x_{w,k}, and d(x_i, x_u) = d(x_i, x_w) − 1 if x_{i,k} = x_{u,k}.

Example 1: If x_u = (0, 0, 0, 1, 1, 0) and x_w = (0, 1, 0, 1, 1, 0), then x_u and x_w differ only in the 2nd component; if x_u = (0, 1, 1, 1, 1, 0), then x_u and x_w differ only in the 3rd component.

Theorem 2: Given a binary vector collective X = {x_1, x_2, . . . , x_n} in which the length of each member is m, assume that x_u and x_w are binary vectors of length m that differ only in the k-th component. If d(x_i, x_w) is determined, then d(x_i, x_u) can be determined for i = 1, 2, . . . , n.
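The update rule behind these theorems — flipping the k-th component of a candidate changes its distance to any vector x_i by exactly ±1, with the sign depending on whether x_i agrees with x_w or with x_u in that component — can be checked numerically. The vectors x_w and x_u below differ only in the 2nd component; the collective members are hypothetical:

```python
def hamming(s, t):
    return sum(abs(a - b) for a, b in zip(s, t))

x_w = [0, 1, 0, 1, 1, 0]
x_u = [0, 0, 0, 1, 1, 0]   # differs from x_w only in component k = 2 (index 1)
k = 1

# Hypothetical collective members used to check the rule.
X = [[0, 1, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0], [0, 0, 0, 0, 1, 1]]
for x_i in X:
    # +1 when x_i agrees with x_w at position k, -1 when it agrees with x_u.
    expected = hamming(x_i, x_w) + (1 if x_i[k] == x_w[k] else -1)
    assert hamming(x_i, x_u) == expected
```

This constant-time update per member is what the QADC algorithm exploits.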
Example 2: Given x_w = (0, 1, 0, 1) and x_u = (1, 1, 0, 1), together with a collective X whose distances to x_w are known. As x_u and x_w differ only in the 1st component (k = 1), based on Theorem 2, each distance d(x_i, x_u) can be obtained from d(x_i, x_w).

In the proposed algorithm, the components of a candidate consensus x_c are determined sequentially, from the 1st to the m-th. First, x_c is created randomly, and the values d(x_c, x_i) for i = 1, 2, . . . , n are computed by (1). In the subsequent steps, the distances between x_c and the collective members are computed based on Theorem 2 and the results of the immediately preceding step.
The QADC algorithm is illustrated in Fig. 1. In this figure, the white boxes represent the component values randomly created in the Initialization step, the green box in each step marks the component whose value is changed, and the red box in each step marks a component whose value has been determined.
The schema of the QADC algorithm is described in Fig. 2. In this algorithm, tO2 is the sum of squared distances from the current candidate consensus x_c to the collective members, and nO2 is the sum of squared distances from a new candidate consensus to the collective members. In each step, tO2 and nO2 are computed. If nO2 is larger than or equal to tO2, the new candidate consensus is discarded; otherwise, the new candidate consensus becomes the current candidate consensus.
After step m, the current candidate consensus is the consensus of the collective X. The QADC algorithm is presented as follows. The main difference in the calculation of each step (from step 1 to step m) between the basic heuristic algorithm and the QADC algorithm is shown in Fig. 3.
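Combining the constant-time distance update of Theorem 2 with the tO2/nO2 scheme of Fig. 2 can be sketched as follows (our own reconstruction in Python, not the paper's listing): the n distances are computed once in O(mn), and each of the m steps then updates them in O(n), giving O(mn) overall:

```python
import random

def hamming(s, t):
    return sum(abs(a - b) for a, b in zip(s, t))

def qadc(X, seed=None):
    """Sketch of the QADC algorithm with incremental distance updates, O(mn)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    xc = [rng.randint(0, 1) for _ in range(m)]    # random initial candidate
    dist = [hamming(xc, xi) for xi in X]          # computed once, O(mn)
    tO2 = sum(di ** 2 for di in dist)             # current sum of squared distances
    for k in range(m):                            # steps 1..m
        # Flipping component k changes each d(xc, xi) by exactly +-1:
        # -1 if member xi disagrees with the current xc at k, else +1.
        new_dist = [di - 1 if xi[k] != xc[k] else di + 1
                    for di, xi in zip(dist, X)]
        nO2 = sum(di ** 2 for di in new_dist)     # O(n) per step
        if nO2 < tO2:                             # keep the new candidate
            xc[k] ^= 1
            dist, tO2 = new_dist, nO2
    return xc, tO2
```

With the same initial candidate, this sketch accepts and rejects exactly the same flips as the full-recomputation heuristic, only faster.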

Theorem 3: The computational complexity of the QADC algorithm is O(mn).
Theorem 4: The 2-Optimality consensus determined by the basic heuristic algorithm and that by the QADC algorithm is the same if initial candidate consensuses are the same.

V. EXPERIMENT AND ANALYSIS
This section assesses the efficiency of the QADC algorithm. We compare the QADC algorithm with the basic heuristic algorithm, which is chosen because it is the best algorithm of previous studies and is widely used. The two algorithms are compared in two respects: running time and consensus quality. The main tasks in this section are as follows:
• Measuring the consensus quality of the QADC algorithm,
• Comparing the consensus quality of the QADC algorithm with that of the basic heuristic algorithm,
• Comparing the running time of the QADC algorithm with that of the basic heuristic algorithm.
The consensus quality of a heuristic algorithm is estimated as follows:

quality(x*) = d²(x_opt, X) / d²(x*, X),   (4)

where x* is the consensus determined by the heuristic algorithm, and x_opt is the optimal consensus determined by the brute-force algorithm.
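Assuming quality is the ratio of the optimal to the achieved sum of squared distances (so a value of 1.0 means the heuristic found an optimum, matching the reported quality range), it can be computed against a brute-force optimum on small instances:

```python
from itertools import product

def hamming(s, t):
    return sum(abs(a - b) for a, b in zip(s, t))

def d2(xc, X):
    return sum(hamming(xc, xi) ** 2 for xi in X)

def brute_force_optimum(X):
    """O(n * 2^m): enumerate all binary vectors of length m."""
    m = len(X[0])
    return min(product([0, 1], repeat=m), key=lambda c: d2(list(c), X))

def quality(x_star, X):
    """Assumed quality ratio: optimal over achieved sum of squared distances.
    Equals 1.0 when the heuristic consensus x_star is itself optimal."""
    x_opt = brute_force_optimum(X)
    return d2(list(x_opt), X) / d2(x_star, X)
```

The exponential brute force is only feasible for small m (such as the 23 attributes of the dataset below), which is why heuristics are needed in general.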

In this study, the significance level is chosen to be 0.05 (α = 0.05).

A. MEASURING THE CONSENSUS QUALITY
We used the hepatocellular carcinoma dataset in this study. This dataset comprises 205 real patients diagnosed with hepatocellular carcinoma. Each patient has 23 binary attributes, and each attribute represents a specific disease. Thus, each patient is represented as a binary vector of length 23: if a patient has a specific disease, the value of the corresponding component is 1, otherwise 0.
If we run the QADC algorithm on this dataset once, one consensus is generated, and its quality is computed by (4). In the experiment, a set of consensus qualities forms a sample, and this sample is used to measure the consensus quality of the QADC algorithm. We need to determine how many times to run the QADC algorithm: the number of runs of the QADC algorithm on this dataset is also the size (sz) of the sample.
We need to compute the consensus quality of the QADC algorithm with a margin of error (E) of 0.002 and a confidence level of 95%. The sample deviation of the pilot survey (s) was 0.02. The sample size is determined as follows [64]:

sz ≥ (Z · s / E)².

For a confidence level of 95%, Z = 1.96. We have sz ≥ (1.96 × 0.02 / 0.002)² = 384.16, so we can choose sz = 485. We perform the QADC algorithm on the hepatocellular carcinoma dataset 485 times. After every run of the QADC algorithm, a consensus is generated, and its quality is determined by (4). Finally, we obtain a consensus quality sample with a size of 485. This sample is shown in Table 1. The mean of the sample is 0.976, so the consensus quality of the QADC algorithm is 0.976 ± 0.001.
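The sample-size computation above can be replayed in a few lines (sz ≥ (Z·s/E)² is the standard formula for estimating a mean; 485 is the paper's generous round-up over the minimum):

```python
import math

# Sample-size formula from the text: sz >= (Z * s / E)^2, with
# Z = 1.96 (95% confidence level), pilot deviation s = 0.02, margin E = 0.002.
Z, s, E = 1.96, 0.02, 0.002
sz_min = (Z * s / E) ** 2   # minimum required sample size, = 384.16
sz = 485                    # the chosen number of runs, comfortably above sz_min
assert sz >= math.ceil(sz_min)
```

Any sz at or above 385 would satisfy the stated margin of error; 485 simply adds slack.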
The boxplot of the sample consensus is shown in Fig. 4. The maximum consensus quality is 1.000, and the minimum consensus quality is 0.951. The mean of the sample is 0.976.

B. EVALUATING THE CONSENSUS QUALITY
In this section, we continue the experiment by comparing the consensus qualities of the QADC algorithm and the basic heuristic algorithm. The consensus quality sample of the QADC algorithm has been presented in Table 1.

The consensus quality sample of the basic heuristic algorithm is computed as follows. We perform the basic heuristic algorithm on the hepatocellular carcinoma dataset 485 times. After each run of the algorithm, a consensus is generated, and its quality is computed by (4). Finally, we obtain a consensus quality sample of the basic heuristic algorithm with a size of 485; this sample is shown in Table 2.
We utilize the Shapiro-Wilk test to check the distributions of the two consensus quality samples. The significance level is chosen to be 0.05. The p-values of the two samples are less than 0.05 (p-value = 0.0015 and p-value = 0.0381 for the samples of the QADC algorithm and the basic heuristic algorithm, respectively). Thus, there is evidence that these two samples do not come from a normal distribution.
The hypotheses to compare the consensus quality of these two algorithms are presented as follows:
• Hypothesis H0: The difference in consensus quality between the QADC algorithm and the basic heuristic algorithm is not significant.
• Hypothesis H1: The difference in consensus quality between the QADC algorithm and the basic heuristic algorithm is significant.
Since the two samples do not come from a normal distribution, the Wilcoxon rank-sum test is utilized.
We obtain p-value = 0.0952. The p-value is greater than 0.05; thus, the hypothesis H0 cannot be rejected. It means that the difference in consensus quality between the QADC algorithm and the basic heuristic algorithm is not statistically significant.
In Theorem 4, we proved that if the initial candidate consensuses x_c are the same, then the 2-Optimality consensus determined by the QADC algorithm and that determined by the basic heuristic algorithm are the same. Besides, the analysis of the experimental results shows that the difference between the consensus quality of the QADC algorithm and that of the basic heuristic algorithm is not statistically significant.
Similar to the basic heuristic algorithm, the QADC algorithm keeps track of one initial candidate consensus to determine the consensus of a collective. Keeping track of only one initial candidate consensus results in the moderate consensus quality of the QADC algorithm; this is the main limitation of the QADC algorithm. One approach to address this limitation is to keep track of many initial candidate consensuses, in which case the diversity of the initial candidate consensuses should be carefully considered.

C. EVALUATING THE RUNNING TIME
The basic heuristic algorithm is the fastest in the literature; therefore, we compare the running time of the QADC algorithm with that of the basic heuristic algorithm. Two datasets are generated randomly. First, we perform the two algorithms on dataset 1 and obtain two samples: one of the running times of the QADC algorithm and one of the running times of the basic heuristic algorithm. Table 3 shows the two samples and the ratio of the basic heuristic algorithm's running time to the QADC algorithm's running time.
We utilize the Shapiro-Wilk test to check the distributions of the two samples. The p-values of the two samples are larger than 0.05 (p-value = 0.8881 and p-value = 0.7852 for the running time samples of the QADC algorithm and the basic heuristic algorithm, respectively). It means that these samples come from normal distributions. The hypotheses to compare the running time of the two algorithms are stated as follows:
• Hypothesis H0: The difference in running time between the QADC algorithm and the basic heuristic algorithm is not significant.
• Hypothesis H1: The difference in running time between the QADC algorithm and the basic heuristic algorithm is significant.
The paired samples come from normal distributions; therefore, we use the paired t-test. We obtain p-value = 0.000007. Since the p-value is less than 0.05, the hypothesis H0 is rejected. It means that the difference in running time between the QADC algorithm and the basic heuristic algorithm is significant. Thus, their means are compared.
The means of the running time samples of the QADC algorithm and the basic heuristic algorithm are 0.059757 and 0.353071, respectively. The average running time of the QADC algorithm is 16.92% (0.059757 / 0.353071 × 100%) of that of the basic heuristic algorithm.
Second, we perform the QADC algorithm and the basic heuristic algorithm on dataset 2 and obtain a running time sample for each algorithm. Table 4 shows these two samples and the ratio of the running time of the basic heuristic algorithm to that of the QADC algorithm.
The Shapiro-Wilk test is utilized to check the distributions of the two samples. The p-values of the two samples are larger than 0.05 (p-value = 0.9142 and p-value = 0.7804 for the running time samples of the QADC algorithm and the basic heuristic algorithm, respectively). It means that these samples come from normal distributions.
The hypotheses for comparing the running time of the two algorithms are stated as follows:
• Hypothesis H0: The difference in running time between the QADC algorithm and the basic heuristic algorithm is not significant.
• Hypothesis H1: The difference in running time between the QADC algorithm and the basic heuristic algorithm is significant.
The paired t-test is used for this test. We obtain p-value = 0.000005. Because the p-value is less than 0.05, the hypothesis H0 is rejected: the difference in running time between the QADC algorithm and the basic heuristic algorithm is significant, and their means are compared.
The means of the running time samples of the QADC algorithm and the basic heuristic algorithm are 0.083336 and 0.596529, respectively. The average running time of the QADC algorithm is 13.97% (0.083336 / 0.596529 × 100%) of that of the basic heuristic algorithm.
The computational complexity of the basic heuristic algorithm is O(m²n), and that of the QADC algorithm is O(mn). For dataset 1, the length of the vectors is 23, so the computational complexities of the basic heuristic algorithm and the QADC algorithm are O(23²n) and O(23n), respectively, and the running time of the QADC algorithm is 16.92% of that of the basic heuristic algorithm. For dataset 2, the length of the vectors is 30, so the computational complexities of the basic heuristic algorithm and the QADC algorithm are O(30²n) and O(30n), respectively, and the running time of the QADC algorithm is 13.97% of that of the basic heuristic algorithm. The QADC algorithm is faster than the basic heuristic algorithm, and the achieved results of these experiments confirm the efficiency and the computational complexities of the two algorithms.

VI. CONCLUSION
In this study, we proposed a fast algorithm (QADC) to determine the 2-Optimality consensus for binary vector collectives, based on a new method for calculating the distances from a candidate consensus to the members of the collective. The computational complexity of the QADC algorithm was reduced to O(mn). In the experiments, when the length of the collective members is 30, the running time of the QADC algorithm equals 13.97% of that of the basic heuristic algorithm; when the length is 23, it equals 16.9%. Besides, the difference between the consensus quality of the proposed QADC algorithm and that of the basic heuristic algorithm, the best algorithm in the literature, is not statistically significant.
In future work, we will consider keeping track of many initial candidate consensuses with high diversity to enhance the consensus quality of the QADC algorithm.

APPENDIX
In this section, we present the proofs of Theorems 1, 2, 3, and 4.
Proof of Theorem 1: Because x_u and x_w differ only in the k-th component, flipping that component either introduces or removes exactly one mismatch with any member x_j of the collective. Thus, d(x_j, x_u) = d(x_j, x_w) + 1 if x_j and x_w agree in the k-th component, and d(x_j, x_u) = d(x_j, x_w) − 1 otherwise.
Proofs of Theorems 2 and 3: The two vectors x_w and x_u differ only in the k-th component, and the distance d(x_n, x_w) is already determined. By Theorem 1, d(x_n, x_u) follows directly. From the aforementioned analyses, it is clear that the distances between x_u and the members of the collective X can be determined from the distances between x_w and the members of the collective X.
Proof of Theorem 4: In the Initialization step, the value of x_c in the basic heuristic algorithm and the value of x_c in the QADC algorithm are equal. After the Initialization step, the sum of squared distances between x_c and the collective's members in the basic heuristic algorithm therefore equals that in the QADC algorithm.
At step 1 (k = 1), when changing the value of the 1st component of x_c, the candidate vectors x_c considered by the two algorithms are equal. Thus, the sum of squared distances between x_c and the collective members computed by the basic heuristic algorithm equals that computed by the QADC algorithm. Therefore, the two algorithms choose the same value for the 1st component, and after step 1 the vectors x_c in the two algorithms are equal.
At step 2 (k = 2), when changing the value of the 2nd component of x_c, the candidate vectors x_c considered by the two algorithms are equal. Thus, the sum of squared distances between x_c and the collective members computed by the basic heuristic algorithm equals that computed by the QADC algorithm. Therefore, the two algorithms choose the same value for the 2nd component, and after step 2 the vectors x_c in the two algorithms are equal.
Assume that after step p (p ≤ m − 1), the vectors x_c in the two algorithms are equal. We now prove that after step p + 1, the vectors x_c in the two algorithms are still equal.
At step p + 1, when changing the value of the (p + 1)-th component of x_c, the candidate vectors x_c considered by the two algorithms are equal. Thus, the sum of squared distances between x_c and the collective members computed by the basic heuristic algorithm equals that computed by the QADC algorithm. Therefore, the two algorithms choose the same value for the (p + 1)-th component, and after step p + 1 the vectors x_c in the two algorithms are equal.
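The relation behind Theorem 1, that flipping a single component changes the Hamming distance to each member by exactly ±1, can be checked numerically. The vectors below are illustrative, not from the paper:

```python
def hamming(a, b):
    """Hamming distance between two binary vectors of equal length."""
    return sum(x != y for x, y in zip(a, b))

# x_w and x_u differ only in the k-th component (here k = 1, 0-indexed).
x_w = [0, 1, 1, 0]
x_u = [0, 0, 1, 0]
k = 1

members = [[0, 1, 0, 0], [1, 0, 1, 1], [0, 0, 1, 0]]
for x in members:
    d_w, d_u = hamming(x, x_w), hamming(x, x_u)
    # If x agrees with x_w at component k, the flip adds one mismatch;
    # otherwise it removes one.
    expected = d_w + 1 if x[k] == x_w[k] else d_w - 1
    print(d_w, d_u, expected)  # d_u always equals expected
```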
DAI THO DANG received the M.S. degree in computer science from the University of Nice Sophia Antipolis, Nice, France. He is currently pursuing the Ph.D. degree in computer science with Yeungnam University, Republic of Korea, in cooperation with the Wrocław University of Science and Technology, Poland. After working in the industry for several years, he joined the University of Danang, Vietnam, as a Lecturer, in 2011. His research interests include collective intelligence, knowledge integration methods, algorithms, consensus theory, and inconsistent knowledge processing. He is an active Reviewer of IEEE TRANSACTIONS ON CYBERNETICS.
NGOC THANH NGUYEN (Senior Member, IEEE) is currently a Full Professor with the Wrocław University of Science and Technology, and the Head of the Information Systems Department, Faculty of Computer Science and Management. He has edited more than 30 special issues in international journals, 52 books, and 35 conference proceedings. He is the author or coauthor of five monographs and more than 350 journal articles and conference papers. His research interests include collective intelligence, knowledge integration methods, inconsistent knowledge processing, and multi-agent systems. He was a General Chair or a Program Chair of more than 40 international conferences. He serves as an Expert of the National Center of Research and Development and the European Commission in evaluating research projects in several programs, such as Marie Sklodowska-Curie Individual Fellowships, FET, and EUREKA. He has given 20 plenary and keynote speeches at international conferences and more than 40 invited lectures in many countries. In 2009, he was granted the title of Distinguished Scientist by ACM. He was also a Distinguished Visitor of IEEE and a Distinguished Speaker of ACM. He also serves as the Chair of the IEEE SMC Technical Committee on Computational Collective Intelligence. He also serves as the Editor-in-Chief of International Journal of Information and Telecommunication (Taylor & Francis), Transactions on Computational Collective Intelligence (Springer), and Vietnam Journal of Computer Science (Springer). He is also an associate editor of several prestigious international journals.
DOSAM HWANG received the Ph.D. degree from Kyoto University, Kyoto, Japan. He served as the Head of Yeungnam University's Computer Engineering Department for five years, from 2005 to 2009. He held a position as a Principal Researcher at the Korea Institute of Science and Technology (KIST) and has been a Visiting Professor with the Korea Advanced Institute of Science and Technology (KAIST). He is currently a Full Professor with the Department of Computer Engineering, Yeungnam University, Republic of Korea. His research interests include natural language processing, ontology, knowledge engineering, information retrieval, and machine translation. He has been a co-chair of several international conferences as well as a steering committee member of the ICCCI, ACIIDS, and MISSI international conferences. He was the Assistant Secretary of ISO/TC37/SC4 for language resource management from 2005 to 2007, and at the same time the Secretary of the Korean TC for ISO/TC37/SC4. In 2006, he was a Director of the Korean Society for Cognitive Science (KSCS) and the Korean Information Science Society (KISS); since 2007, he has also been serving as the Society's Director and the Mentor of a knowledge engineering study group. In addition, he has participated in several Korean national research projects, such as a machine translation system project from 1985 to 1990; the national IT ontology infrastructure and technology development project ''CoreOnto'' from 2006 to 2009; and ''Exobrain'' from 2013 to 2014, a project focused on the construction of a deep knowledge base and a question-answering platform. He has also been in charge of an intelligent service integration based on IoT Big Data as part of another of Korea's principal national research projects, ''BK+'', since 2014.
In recognition of his commitment and contributions to these fields, he was honored as a Distinguished Researcher of KIST in 1988 by Korea's Ministry of Science and Technology (MoST), and he was awarded a prize for Good Conduct from Kyunghee High School in 1973. He has more than 50 publications.
VOLUME 8, 2020