Inferring Preferences for Multi-Criteria Ordinal Classification Methods Using Evolutionary Algorithms

Multicriteria sorting involves assigning the objects of a decision (actions) into a priori known ordered classes considering the preferences of a decision maker (DM). Two new multicriteria sorting methods were recently proposed by the authors. These methods are based on a novel approach called interval-based outranking, which provides the methods with attractive practical and theoretical characteristics. However, as is well known, defining parameter values for methods based on the outranking approach is often very difficult. This difficulty arises not only from the large number of parameters and the DM's lack of familiarity with them, but also from imperfectly known (even missing) information. Here, we address: i) how to elicit the parameter values of the two new methods, and ii) how to incorporate imperfect knowledge during the elicitation. We follow the preference disaggregation paradigm and use evolutionary algorithms to address the resulting optimization problems. Our proposal performs very well in a wide range of computational experiments. Interesting findings are: i) the method restores the assignment examples with high effectiveness using only three profiles in each limiting boundary or representative actions per class; and ii) the ability to appropriately assign unknown actions can be greatly improved by increasing the number of limiting profiles.


I. INTRODUCTION
Among the different types of problems addressed by multiple-criteria decision analysis (MCDA) approaches, the multiple-criteria ordinal classification, or sorting, problem has received great interest lately given its interesting theoretical challenges and its applicability in real scenarios. In multiple-criteria ordinal classification, a set of decision alternatives (objects of decision, actions) must be assigned to a set of classes. These classes, also called categories in the related literature, have been predefined and ordered using the decision maker's (DM) preferences. In this paper, we are interested in multi-criteria ordinal classification methods inspired by the outranking approach. In most cases, the definition of each class can be made through a reference decision action (profile) that can be used as a characteristic action to represent the class, as in ELECTRE TRI-C [1], or as a limiting boundary that separates a pair of classes, as in ELECTRE TRI-B [2]. Then, to perform the assignment of new actions, both the profiles and the actions-to-be-assigned are evaluated by the DM based on a set of conflicting criteria. With the aim of providing a better description of the classes, ELECTRE TRI-C (respectively, ELECTRE TRI-B) was extended to ELECTRE TRI-nC (respectively, ELECTRE TRI-nB) in [3] (respectively, [4]); these extensions use a set of profiles in the definition of each class (respectively, boundary). Thus, it is possible to consider more pieces of information regarding the relations between actions to be assigned and reference profiles in order to, potentially, provide better decision aid.
In ELECTRE methods, the elicitation of the model's parameters is a real concern. When using a direct elicitation method, the DM, commonly aided by a decision analyst, must explicitly set the parameter values representing his/her preferences. Several authors (for example, [5], [6], [7]) have argued against direct elicitation since: i) the DM may not be able to completely understand the meaning of the model's parameters; ii) the DM may not be available to engage in a long and complex process of providing appropriate numerical values, which are usually very unfamiliar to her/him; and iii) the DM may be a collective entity with conflicting values and ill-defined preferences. Alternatively, when using an indirect elicitation method (e.g., the so-called preference disaggregation analysis), the DM typically uses his/her holistic judgments to provide/accept a set of reference examples inherently containing the DM's preferences; then, through a regression-inspired procedure, an extraction process must be performed in order to infer the parameter values underlying the preferences contained in the reference examples. Under some strong simplifications, the extraction of outranking model parameters using an indirect elicitation method can be addressed through classical mathematical programming techniques, as in [8]. But such an indirect parameter elicitation becomes a very complex optimization problem when veto thresholds should be inferred; this is because inferring all the parameters simultaneously requires solving non-linear optimization problems with nonconvex constraints [8]. In such cases, evolutionary algorithms should be used, as in [7], [9], and [10]. These works have found that the non-linearity of the problem together with complex constraints is usually better handled by evolutionary algorithms than by other exhaustive and/or metaheuristic approaches. Less sophisticated metaheuristic approaches may be used when the preference model does not include veto, as in [11]. However, this type of approach neglects information required to encompass important features of reality, such as the capacity to identify veto situations when comparing two actions (that is, when an action is so bad on a given criterion that it cannot be better than the other action in general terms, regardless of their evaluations on the other criteria). Fernández et al. [12] proposed an evolutionary algorithm to infer the whole set of ELECTRE TRI-nB model parameters. However, only the so-called pseudo-conjunctive procedure was used in that work, and a single decision rule is then used for the optimization process. As explained below, further pieces of information can be used to improve the decision process. To our present knowledge, there is no indirect parameter elicitation method for ELECTRE TRI-nC. Perhaps this is due to the inference process being more complicated than in the case of ELECTRE TRI-nB, since the former requires working with two decision rules that are equivalent by the transposition operation (consisting of reversing both the order of categories and the sense of preferences) [1], [13].
As stated in [14], indirect elicitation methods are generally attractive for the DM, but, to a great extent, their performance is degraded when there is scarce information about the DM's preferences (a relatively small reference set of decision examples): in this case, the indirect elicitation methods often suggest many solutions in the parameter space [15]. All these distributed solutions satisfy the known preferences of the DM. This is imprecise information that should be modelled in an appropriate way.
Thus, imprecise (maybe arbitrary) settings of the outranking model's parameters may result from either a direct or an indirect elicitation process. To better model human hesitancy, many extensions of outranking methods have been proposed that use fuzzy-based approaches (e.g., intuitionistic fuzzy sets, hesitant fuzzy sets, interval-valued fuzzy numbers, etc.) [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26]. It should be underlined that all these fuzzy-based extensions of outranking methods are devoted to solving choosing and ranking decision problems. To the best of our knowledge, there is no fuzzy extension of the ELECTRE TRI-nC and ELECTRE TRI-nB methods.
To deal with imprecise information in model parameters and criterion scores, Fernández et al. [27] recently developed two multi-criteria ordinal classification approaches, INTERCLASS-nB and INTERCLASS-nC, which extend ELECTRE TRI-nB and ELECTRE TRI-nC to the interval framework. As Fernández et al. [27] argue, there are situations where imperfect knowledge about the parameter values may be characterized in a natural way by interval numbers, which are representations of magnitudes whose precise values are unknown. Since setting a parameter value as an interval number is easier than setting it as a precise value, INTERCLASS-nB and INTERCLASS-nC reduce the difficulty of direct elicitation of model parameters. However, when the DM is a mythical entity (e.g., the public opinion) or an inaccessible person (e.g., the CEO of a multinational enterprise), INTERCLASS-nB and INTERCLASS-nC should be complemented by an indirect elicitation procedure which, learning from decision examples, makes it possible to define the parameter values of the model as interval numbers.
The research problem is defined through the following objective and research questions. Given the difficulty of a direct elicitation of model parameters, the aim of this paper is to develop an effective method that allows the identification of parameter values from some assignment examples in the context of the INTERCLASS-nB and INTERCLASS-nC methods. Some parameters may be directly set as interval numbers, whereas others may be inferred. Thus, this paper combines two ways of reducing the difficulty of parameter elicitation. Our approach is extensively assessed, in-sample and out-of-sample, in its ability to restore the assignment examples and its capacity to make consistent assignments of new actions.
We intend to answer the following research questions:
- Concerning INTERCLASS-nB (respectively, INTERCLASS-nC), what is the appropriate number of limiting profiles (respectively, characteristic or representative actions) to achieve a good characterization of the limiting boundary between adjacent classes (respectively, the related class)?
- To what extent does the effectiveness of the inference approach depend on the number of criteria and classes?
- To what extent does the effectiveness depend on the number of assignment examples?
- What is the capacity of the inference approach, regarding each method, to learn from the assignment examples provided by the DM?
- How robust is the inference approach with respect to many diverse preference models (DMs)?

The novelty of this proposal rests on the following bases:
• This is the first approach to infer the preference model parameters of a multi-criteria ordinal classification method in which classes are described by representative actions, as in ELECTRE TRI-nC;
• Until now, there were two alternative ways of dealing with the difficulty of eliciting the preference model parameters: modeling imprecision and inferring parameters from decision examples. This paper presents the first approach in which both forms are combined;
• INTERCLASS-nB and INTERCLASS-nC are here fully characterized from a preference disaggregation paradigm, and the effectiveness of both methods is compared in a wide range of problems; interesting conclusions follow from such a comparison.

The first question was kept open in [27]. Since finding formal theoretical answers to the above questions may be impossible, we perform a simulation experiment in which a wide range of DM preferences is considered, and the effectiveness of the inference method is characterized for different numbers of classes, criteria, and assignment examples. We extend accuracy measures from the literature to properly characterize the effectiveness of the model; however, these new measures imply complex optimization problems that must be addressed through metaheuristics. Such techniques have been widely used in several contexts (e.g., [28], [29], [30]). In particular, we exploit the canonical version of a genetic algorithm to address the proposed optimization problems.
The remainder of this document is organized as follows. In Section II, we give a brief description of the INTERCLASS-nB and INTERCLASS-nC methods as well as the interval outranking approach, a fundamental component of these methods. Section III presents our main proposals on how to infer the parameter values of both methods.
In Section IV, we describe an extensive computational experiment and its results evaluating our elicitation methods. Finally, Section V concludes this paper.

A. FUNDAMENTAL NOTIONS ON INTERVAL NUMBERS
The main concept of interval analysis theory [31], [32] is the so-called interval number. We now present a description of such a concept.
An interval number describes a quantity whose precise value is unknown but lies within a range of real numbers, I = [i^-, i^+]. The limits of this range, i^-, i^+ ∈ R, are known. Thus, by definition, a real number r can be represented by the interval number R = [r^-, r^+], where r = r^- = r^+. Furthermore, any real number i ∈ I is called a realization of the interval number I. To state clearer definitions, in the rest of this document, interval numbers are denoted by boldface italic letters.
In order to estimate the credibility degree of an interval number I = [i^-, i^+] being greater than or equal to another interval number J = [j^-, j^+], the possibility function defined in [33] and used in [34] is employed; its expression is given in (1). The possibility function in (1) indicates that p(I ≥ J) is the credibility degree of the assertion ''given that both realizations are established, i ∈ I is not less than j ∈ J''. Thus, the possibility function denotes the robustness of I ≥ J, even when these quantities are undetermined.
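Since Equation (1) is not reproduced above, the sketch below uses one commonly used possibility degree for comparing two interval numbers; the exact form adopted in [33] and [34] should be checked against those references, so this is an assumed, illustrative implementation.

```python
def possibility_geq(i_lo, i_hi, j_lo, j_hi):
    """Possibility degree p(I >= J) for intervals I = [i_lo, i_hi], J = [j_lo, j_hi].

    A commonly used overlap-based ratio clipped to [0, 1] (assumed here as a
    stand-in for Eq. (1)); degenerate intervals (real numbers) are handled by
    a direct comparison.
    """
    width = (i_hi - i_lo) + (j_hi - j_lo)
    if width == 0.0:                       # both intervals collapse to real numbers
        return 1.0 if i_lo >= j_lo else 0.0
    return max(0.0, min(1.0, (i_hi - j_lo) / width))

# Example: I = [3, 5], J = [4, 6] -> partial credibility that I >= J
print(possibility_geq(3, 5, 4, 6))         # 0.25
```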

B. INTERVAL-BASED OUTRANKING APPROACH
Fernández et al. [34] proposed an extension of the outranking approach whose main feature is its ability to deal with the imperfect knowledge involved in the decision maker's preferences and in the impacts of actions on criteria. These types of imperfect information can be modeled in this extension using both interval numbers and the traditional pseudo-criteria based on discriminating thresholds (e.g., [35], [36]).
The formal definition of the interval outranking approach uses the following notation. Let A be a set of potential actions. Each x ∈ A is evaluated on a coherent family (in the sense of [37]) of N criteria, G, which, without loss of generality, increase with preference. Assume that G_1 ⊆ G is the set of criteria whose imperfect knowledge can be modeled using discriminating thresholds, as is traditionally done in the later ELECTRE methods, and that G_2 ⊆ G is the set of criteria whose imperfect knowledge can be modeled using interval numbers; that is, each g_j ∈ G_2 takes interval values of the form g_j(x) = [g_j^-(x), g_j^+(x)]. The interval outranking approach requires the assignment of appropriate values to the following parameters to satisfactorily reflect the DM's preferences (j = 1, ..., N):
• w_j = [w_j^-, w_j^+], the weight of criterion g_j;
• v_j = [v_j^-, v_j^+], the veto threshold of criterion g_j; and
• λ = [λ^-, λ^+], which reflects a majority threshold.
Furthermore, it is straightforward for the DM to assign values to the preference threshold, p_j(·), and the indifference threshold, q_j(·), with p_j(·) ≥ q_j(·) ≥ 0. As in the classical outranking approach, the interval outranking approach estimates a credibility index, η(x, y) ∈ [0, 1], between pairs of actions, of the assertion ''x is at least as good as y'', xSy. The detailed procedure used by the interval-based outranking approach to estimate this credibility index is described in Appendix C.
The approach assumes that the DM uses a credibility threshold δ > 0.5 such that if η (x, y) ≥ δ then the assertion ''x is at least as good as y'' is accepted. Using this threshold, the following relations are defined.
The concept of dominance is also extended in [34]. In that work, dominance is not crisp, but there is a ''degree of credibility'', α, of the dominance.
Definition 2 (Extended dominance): Let x ≠ y be two actions and α ∈ R; then y is α-dominated by x, denoted by xD(α)y, if and only if the following conditions are fulfilled: i. g_j(x) ≥ g_j(y), for all g_j ∈ G_1; ii. min …

C. THE INTERCLASS-NB METHOD
The INTERCLASS-nB method [27] extends ELECTRE TRI-nB to the interval framework, building on the interval outranking approach described above.

Condition 1:
Let C be a finite set of classes C = {C_1, ..., C_k, ..., C_M}, M ≥ 2, ordered in increasing preference. In INTERCLASS-nB, the boundary between categories C_k and C_{k+1} is described by a set of limiting profiles, B_k = {b_k,j}, such that, for given δ > 0.5 and λ > [0.5, 0.5], the following conditions are fulfilled:
i. C_k is defined through a set of reference upper limiting profiles, B_k, and through a set of reference lower limiting profiles, B_{k-1}. It is assumed that all b_k,j of B_k are in C_{k+1} (that is, all classes are closed from below);
ii. B_0 (respectively, B_M) is composed of the anti-ideal (respectively, the ideal) action;
iii. For all k, there is no pair …;
v. For all k and for each limiting action w in B_k, there exists at least one action z in B_{k+1} such that zD(α)w, with α ≥ δ;
vi. For all k and for each limiting action w in B_{k+1}, there exists at least one z in B_k such that wD(α)z, with α ≥ δ;
vii. For all k and for each limiting action w in B_k, there exists at least one z in B_{k+1} such that zP(δ, λ)w.

The following relations among profiles and decision actions are defined by INTERCLASS-nB: … (the latter should hold for at least one b_k,j). The assignment procedures constituting the INTERCLASS-nB method are based on the following two logics.

Pseudo-conjunctive procedure:
i. Compare x to B_k for k = M − 1, ..., 0 until the first value, k, such that xS(δ, λ)B_k;
ii. Assign x to class C_{k+1}.

Pseudo-disjunctive procedure:
i. Compare x to B_k for k = 1, ..., M until the first value, k, such that B_k P(δ, λ)x;
ii. Assign x to class C_k.
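The two assignment logics can be sketched as follows; the predicates `outranks(x, B_k)` (standing for xS(δ, λ)B_k) and `preferred(B_k, x)` (standing for B_k P(δ, λ)x) are assumed helpers built on top of the interval outranking relations, and classes are indexed 1..M with C_1 the least preferred.

```python
def assign_pseudo_conjunctive(x, boundaries, outranks):
    """Pseudo-conjunctive rule: scan boundaries downward (k = M-1, ..., 0) until
    x S(delta, lambda) B_k holds, then assign x to class C_{k+1}.
    `boundaries` is the list [B_0, ..., B_M]."""
    M = len(boundaries) - 1
    for k in range(M - 1, -1, -1):
        if outranks(x, boundaries[k]):
            return k + 1              # class C_{k+1}
    return 1                          # by construction x outranks B_0 (anti-ideal)


def assign_pseudo_disjunctive(x, boundaries, preferred):
    """Pseudo-disjunctive rule: scan boundaries upward (k = 1, ..., M) until
    B_k P(delta, lambda) x holds, then assign x to class C_k."""
    M = len(boundaries) - 1
    for k in range(1, M + 1):
        if preferred(boundaries[k], x):
            return k                  # class C_k
    return M                          # the ideal boundary B_M is preferred to any x
```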

D. THE INTERCLASS-NC METHOD
We continue using the previous notation to present a description of the INTERCLASS-nC method, following [27].
In the INTERCLASS-nC method, each class C_k is characterized by a set of decision actions R_k; R = {R_0, R_1, ..., R_{M+1}} is the set of all the characterizing decision alternatives (R_0 and R_{M+1} are composed of the anti-ideal and ideal actions, respectively). Assume a given δ > 0.5.

Condition 2:
Each element in R_k must fulfill the following conditions:
i. For all k and for each action w in R_k, there is at least one action z in R_{k+1} such that zD(α)w (α ≥ δ);
ii. For all k and for each action w in R_{k+1}, there is at least one action z in R_k such that wD(α)z (α ≥ δ);
iii. For all k and for each action w in R_{k+1}, there is no action z in R_k such that zS(0.5, λ)w.

The credibility index of the outranking relation of action x over the subset R_k is defined as …, while the credibility index of the outranking relation of subset R_k over an action x is defined as …. These credibility indices allow the construction of interval crisp outranking relations between decision actions and sets of characteristic actions. The selection function is defined as …. The assignments of alternatives to classes are performed in INTERCLASS-nC using two rules, called the descending rule and the ascending rule, which should be used conjointly, as in both ELECTRE TRI-C and ELECTRE TRI-nC. We now describe these rules.
Descending assignment rule:
i. …;
ii. …;
iii. If …, then select C_k as a possible class to assign x; otherwise, select C_{k+1};
iv. For k = 0, select C_1 as a possible class to assign x.

Ascending assignment rule:
i. Compare x to R_k for k = 1, ..., M + 1, until the first value, k, such that R_k S(δ, λ)x;
ii. For k = 1, select C_1 as a possible category to assign action x;
iii. If …, then select C_k as a possible class to assign x; otherwise, select C_{k−1}.

III. AN INDIRECT ELICITATION FOR THE PARAMETERS OF THE INTERCLASS-NB AND INTERCLASS-NC METHODS
This section details the main aspects of the approach to infer the parameters of the two multi-criteria ordinal classification methods.
The framework of the inference procedure, following the preference disaggregation analysis (PDA) paradigm, consists of five stages. Its goal is to find an INTERCLASS-nB (respectively, INTERCLASS-nC) model that best reproduces the decision examples provided by the decision maker.
-Stage 1 consists of defining the input data for each model: a coherent family of criteria [37], the sequence of classes, and the set of actions (characterized by their performances on the criteria).
-Stage 2 is associated with the preference information from the DM, that is, the assignment of each action to a class according to the preferences of the DM. It is important to note that these assignments could come from past decisions, from a subset of the actions that the DM originally required to assign to the classes, or from a set of fictitious actions that can be easily judged by the DM.
-Stage 3 concerns the definition of an optimization problem for each method, such that a specific accuracy measure is exploited for each of these methods (see Subsections III.A and III.B).
-Stage 4 uses an evolutionary algorithm to address the optimization problem(s) of Stage 3. This algorithm aims to find at least one set of parameter values compatible with the decision examples provided by the DM. If such a set is not found, the DM is asked to revisit her/his preference information. If the information provided by the decision maker is consistent, then at least one set of preference parameters compatible with such information exists [38].
-Stage 5 assesses the set of parameter values found by the genetic algorithm. If the DM agrees with the recommendation, the procedure stops; otherwise, further information is required from the DM and/or the existing information should be modified.

A. AN OPTIMIZATION-BASED INFERENCE METHOD OF THE INTERCLASS-NB METHOD
The credibility index of the outranking relation of x over y, η(x, y), depends on the values assigned to the parameters of the interval outranking model, Ω = {w_1, ..., w_N, v_1, ..., v_N, λ, δ}; but setting a convenient set of parameter values is not trivial. We present here a procedure in which, using a set of assignment examples (reference set) provided by the DM, it is possible to assign appropriate values to the parameters of the interval outranking model so as to satisfactorily represent the DM's preferences. We use the notation from Section II to define such a procedure.
Let T be a set of decision actions. We assume that each x ∈ T is assigned by the DM to a class C_k, C = {C_1, ..., C_k, ..., C_M}, M ≥ 2, where the classes in C are ordered in increasing preference. Assignments of alternatives to classes are holistic decisions made by the DM; therefore, his/her multi-criteria preferences are reflected in them. We assume that these decisions can be represented by an INTERCLASS-nB model, {Ω, B_0, ..., B_M}; that is, by the parameters of the interval outranking model, Ω, and M + 1 sets of limiting profiles. Since B_0 and B_M are composed, respectively, of the anti-ideal and the ideal actions, we are interested in finding only an approximation to the set of actual preference parameters, Ω_DM^nB = {Ω, B_1, ..., B_{M−1}}. Therefore, the most appropriate set of inferred preference parameters to fit the assignments expressed by the DM, Ω_inf^nB*, is the one that minimizes the number of inconsistencies with respect to the expressed preferences. Let … denote that the DM has assigned x to class C_k, … denote that x is assigned to class C_k using the inferred decision model Ω_inf^nB, and ξ^nB be the set of models fulfilling Condition 1 and any constraints established by the DM. The optimization problem of minimizing the number of inconsistencies between Ω_DM^nB and a given Ω_inf^nB is equivalent to maximizing the following effectiveness measure: …, where NI(Ω_DM^nB, Ω_inf^nB) = Σ_{x∈T} NI(x, Ω_DM^nB, Ω_inf^nB), and …
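Since the exact expression of Problem (2) is not reproduced above, the following sketch is an assumed operationalization: the effectiveness is the proportion of reference actions whose inferred assignment coincides with the DM's assignment, which is equivalent to minimizing the number of inconsistencies NI.

```python
def in_sample_effectiveness_nb(reference, assign_inferred):
    """Fraction of assignment examples reproduced by the inferred model.

    `reference` maps each action x in T to the class index assigned by the DM;
    `assign_inferred(x)` is a hypothetical function returning the class produced
    by the inferred INTERCLASS-nB model (e.g., the pseudo-conjunctive rule).
    Maximizing this quantity minimizes the inconsistency count NI of Problem (2)."""
    hits = sum(1 for x, c_dm in reference.items() if assign_inferred(x) == c_dm)
    return hits / len(reference)
```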

B. AN OPTIMIZATION-BASED INFERENCE METHOD OF THE INTERCLASS-NC METHOD
Following ideas similar to those presented in Subsection III.A, we describe here an inference method to obtain the parameter values of the INTERCLASS-nC method. This method also uses a set of assignment examples in which the DM assigns actions to preferentially ordered classes. Let D be this set of actions, where each x ∈ D is assigned by the DM to an element of the set of classes C = {C_1, ..., C_k, ..., C_M}, or to a range of classes when the ascending and descending assignments are not the same. Therefore, our goal is to find a model of the DM's preferences by inferring a configuration of the INTERCLASS-nC method, Ω_inf^nC* = {Ω*, R_1*, ..., R_M*}, that fulfills Condition 2 and is as consistent as possible with the assignments made by the DM.
Nevertheless, defining a fitness function here is not as straightforward as in the previous subsection, because each x is not necessarily assigned to only one class but to a set of classes. Thus, let χ_DM be the set of classes to which the DM has assigned x and χ_inf the set of inferred classes. If we define the accuracy as …, then we might be too pessimistic, since a single misclassification (perhaps among many classes) would lead to a total error. On the other hand, if we define …, then we might be too optimistic. Therefore, we use here the so-called F_1-score [39], defined through precision, P, and recall, R, as [40]: F_1-score = 2PR/(P + R). We adapt it to define the following optimization problem (cf. [11]): …, where Ac(Ω_DM^nC, Ω_inf^nC) = Σ_{x∈D} Ac(x, Ω_DM^nC, Ω_inf^nC), and …
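A per-action F1 aggregation over the class sets can be sketched as follows; treating χ_DM and χ_inf as sets of class indices and averaging the per-action scores over D is an assumed operationalization of Problem (3), and `assign_inferred` is a hypothetical helper returning the (possibly multi-class) assignment of the inferred model.

```python
def f1_per_action(classes_dm, classes_inf):
    """F1-score for one action: precision/recall between the DM's class set and
    the class set produced by the inferred INTERCLASS-nC model."""
    dm, inf = set(classes_dm), set(classes_inf)
    hits = len(dm & inf)
    if hits == 0:
        return 0.0
    precision = hits / len(inf)
    recall = hits / len(dm)
    return 2 * precision * recall / (precision + recall)


def fitness_nc(examples, assign_inferred):
    """Average per-action F1 over the reference set D (examples: action -> DM class set)."""
    return sum(f1_per_action(c_dm, assign_inferred(x))
               for x, c_dm in examples.items()) / len(examples)
```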

C. AN EVOLUTIONARY ALGORITHM FOR ADDRESSING PROBLEMS (2) AND (3)
Given the nonlinearity of Problems (2) and (3) and the previous results published in several related research works (e.g., [7], [10], [12], [14]), we implement here a genetic algorithm to address these problems. Even though the main aspects of such an algorithm are suitable for addressing both problems, some characteristics are specific to each problem. Thus, we now describe the specific steps to follow. As in Section II, we assume that there are N criteria and M classes. Specific steps to configure Ω_inf^nB: individuals are represented by a real-valued vector composed of K = N(2 + J(M − 1)) + 1 genes, as in Figure 1, where J is the number of profiles used to separate each pair of adjacent classes.
Specific steps to configure Ω_inf^nC: individuals are represented by a real-valued vector composed of N(2 + OM) + 1 genes, as in Figure 2, where O is the number of profiles used to characterize each class.
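A possible decoding of the K = N(2 + J(M − 1)) + 1 genes of the INTERCLASS-nB chromosome is sketched below; the exact gene layout of Figure 1 is not reproduced here, so the ordering (weights, vetoes, λ, then profiles) and the use of central values per gene are assumptions. The nC chromosome of length N(2 + OM) + 1 can be decoded analogously, with O profiles per class over M classes.

```python
def decode_chromosome_nb(genes, N, M, J):
    """Decode a real-valued chromosome of length N*(2 + J*(M-1)) + 1 into an
    INTERCLASS-nB parameter set (illustrative layout): N criterion weights,
    N veto thresholds, the majority threshold lambda, and J limiting profiles
    per boundary, each evaluated on N criteria."""
    assert len(genes) == N * (2 + J * (M - 1)) + 1
    weights = genes[:N]
    vetoes = genes[N:2 * N]
    lam = genes[2 * N]
    profile_genes = genes[2 * N + 1:]
    # boundaries[k][j] is the j-th limiting profile of boundary B_{k+1}, k = 0..M-2
    boundaries, pos = [], 0
    for _ in range(M - 1):
        profiles = []
        for _ in range(J):
            profiles.append(profile_genes[pos:pos + N])
            pos += N
        boundaries.append(profiles)
    return weights, vetoes, lam, boundaries
```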

D. GENERAL STEPS OF THE EVOLUTIONARY ALGORITHM
Each population used in the algorithm contains ps individuals. The individuals of the initial population of the algorithm are randomly generated. The weights can be generated in several ways.
In the experiments below, we use the method presented in [41], where N − 1 numbers, u_1, ..., u_{N−1}, are uniformly randomly generated in (0, 1); these numbers are then sorted in ascending order and used to calculate N values as …. Finally, the weights used in the interval outranking approach are defined as …, where w_i is a value that copes with the imperfect knowledge in the DM's mind about the actual weight of the i-th criterion.
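The generation scheme of [41] corresponds to the classic uniform-spacings method, sketched below; the way each central weight is widened into an interval is not reproduced above, so the ±spread used here is only an illustrative assumption.

```python
import random

def random_interval_weights(N, spread=0.05):
    """Generate N central weights summing to 1 via uniform spacings:
    draw N-1 uniforms in (0, 1), sort them, and take the successive gaps.
    Each central weight w is then widened into an interval [w - spread*w, w + spread*w]
    to model the DM's imperfect knowledge (spread is illustrative only)."""
    cuts = sorted(random.random() for _ in range(N - 1))
    points = [0.0] + cuts + [1.0]
    centers = [points[i + 1] - points[i] for i in range(N)]   # sums to 1 by construction
    return [(max(0.0, w - spread * w), w + spread * w) for w in centers]
```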
In our algorithm, the selection of parents is by binary tournament, and we adopt one-point crossover. To maintain consistency of the weights, we assume that all the genes corresponding to the weights form an indivisible unit. Thus, there are K − N possible crossover points to set Ω_inf^nB (as shown in Figure 3) and NOM + 1 to set Ω_inf^nC (as shown in Figure 4). Once the two selected parents are crossed at a randomly selected crossover point, one offspring individual is generated; this individual is then mutated with a given probability. The mutation of an individual consists of the random generation of each gene unit, fulfilling the set of constraints (4). At each generation of the algorithm, ps (population size) offspring individuals are generated and (possibly) mutated. All the offspring and parent individuals are entered into a pool from which ps − 1 individuals are randomly selected to form the population of the next generation; elitism is applied to one individual per generation. The fitness of each individual is assessed according to the objective of Problem (2) or Problem (3), as appropriate. After a given number of generations, the algorithm identifies the feasible solutions with the best fitness value in the population; let these solutions form a set called best_known. The chromosome representing best_known is obtained as the centroid (average of the parameters) of the individuals within best_known. If the centroid reaches the best-known fitness value, it is considered the best solution.
If not, then the solution in best_known closest to the centroid is considered the best solution of the preliminary run. The distance is calculated as the normalized Euclidean distance over the central values of the parameters. To reduce the effects of randomness, we perform twenty consecutive preliminary runs; starting with the second run, we include the best solution from the previous run in the initial population.
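A minimal sketch of this centroid-based post-processing is given below, assuming each chromosome is a list of central parameter values and that `ranges[i]` (the admissible width of gene i, assumed positive) is used to normalize distances.

```python
import math

def select_final_solution(best_known, fitness, best_fitness, ranges):
    """Return the centroid of the best-fitness individuals if it matches the
    best-known fitness; otherwise return the member of best_known closest to the
    centroid under a normalized Euclidean distance (illustrative sketch)."""
    K = len(best_known[0])
    centroid = [sum(ind[i] for ind in best_known) / len(best_known) for i in range(K)]
    if fitness(centroid) >= best_fitness:
        return centroid

    def dist(ind):
        return math.sqrt(sum(((ind[i] - centroid[i]) / ranges[i]) ** 2 for i in range(K)))

    return min(best_known, key=dist)
```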
We use ParamILS [42], a classic algorithm configurator from the related literature, to set the main parameters of our algorithm: the population size, the number of generations, the crossover probability, and the mutation probability. The values found by this configurator are, respectively, 200, 200, 60%, and 2%; these values are used in this work.
This procedure is formalized in Algorithm 1.

IV. NUMERICAL EXPERIMENTS
In this section, we present the procedure used to assess our inference approach. This assessment intends to demonstrate the ability of our approach to infer the parameter values of the INTERCLASS-nB and INTERCLASS-nC models by establishing its effectiveness in i) reproducing the reference examples of the DM, and ii) appropriately making new assignments. The procedure to assess our approach is, first, to simulate a decision maker who is compatible with the pseudo-conjunctive INTERCLASS-nB (respectively, INTERCLASS-nC) and whose preference model parameters are known; second, using the known preference model, to assign a set of reference actions to ordered classes; third, to exploit the evolutionary algorithm of Subsection III.C by addressing Problem (2) or (3) in order to infer the parameter values of the pseudo-conjunctive INTERCLASS-nB or the INTERCLASS-nC method; fourth, to obtain an in-sample effectiveness by first using the inferred parameters to assign the reference actions to the classes and then measuring the proportion of coincidences; fifth, to obtain an out-of-sample effectiveness by generating new actions, assigning them to classes using both the known and the inferred parameters of the model, and finally measuring the proportion of coincidences.

Algorithm 1 Genetic Algorithm Proposed to Address Problems (2) and (3)
Require: A set of reference examples, T
Ensure: ρ_final, the individual representing the population with the best fitness value
1: i ← 1
2: ρ ← null
3: g ← 0
4: P_g ← create-Initial-Population() {Evolving the solutions for 1000 generations}
5: for g < 1000 do
6:   H_g ← create-Offspring(P_g, selection, crossover, mutation)
7:   P_{g+1} ← generate-Population(P_g ∪ H_g)
8:   g ← g + 1
9: end for
10: best_known ← find-Best(P_g)
11: ρ ← find-Centroid(best_known)
12: if ρ is-best(best_known) then
13:   ρ_final ← ρ
14: else
15:   ρ_final ← find-closest(ρ)
16: end if
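The skeleton below makes Algorithm 1 concrete as a runnable sketch with the operators of Section III-D (binary tournament, one-point crossover, one-individual elitism) and the ParamILS-tuned settings; the helper functions (`random_individual`, `crossover`, `mutate`, `fitness`) are assumed problem-specific callables, and the centroid post-processing of Section III-D would be applied afterwards to the best-fitness individuals.

```python
import random

def genetic_algorithm(fitness, random_individual, crossover, mutate,
                      ps=200, generations=200, p_cross=0.6, p_mut=0.02):
    """Minimal GA sketch for Problems (2) and (3): maximize `fitness` over
    real-valued chromosomes produced by `random_individual`."""
    population = [random_individual() for _ in range(ps)]
    for _ in range(generations):
        offspring = []
        for _ in range(ps):
            a, b = random.sample(population, 2)            # binary tournament, parent 1
            p1 = a if fitness(a) >= fitness(b) else b
            a, b = random.sample(population, 2)            # binary tournament, parent 2
            p2 = a if fitness(a) >= fitness(b) else b
            child = crossover(p1, p2) if random.random() < p_cross else list(p1)
            if random.random() < p_mut:
                child = mutate(child)                      # regenerate genes within constraints (4)
            offspring.append(child)
        elite = max(population, key=fitness)               # elitism: keep the best individual
        pool = population + offspring                      # parents and offspring enter a common pool
        population = [elite] + random.sample(pool, ps - 1)
    return max(population, key=fitness)
```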

A. EXPERIMENTAL INSTANCES
We created a set of experimental instances that allow us to obtain sound conclusions. Each instance i used in the experiments below consists of a) an INTERCLASS-nB or INTERCLASS-nC model, and b) a reference set T_i containing m_i assignment examples. Thus, each instance represents different preferences of a decision maker. We use 20 instances (i = 1, ..., 20) to determine the results shown below.
Furthermore, we defined a wide variety of values in the experiment setup as shown in Table 1.
In our experiment setup, the values of the model parameters, shown in the set of Equations (4), were randomly generated fulfilling …, where …. The definition of limiting profiles (respectively, characteristic actions) must fulfill Condition 1 (respectively, Condition 2).

B. ASSESSMENT PROCEDURE
We use the following assessment procedure. The assignment policy used by the Ω_DM^nB model is the pseudo-conjunctive procedure.
…
5. Obtain, using the approach of Section III, a set of parameters Ω_inf^nB* or Ω_inf^nC* as consistent as possible with the assignments made by the corresponding simulated DM model. The maximum consistency is identified with the optimal solution to Problem (2) or Problem (3), and the optimization is performed using Algorithm 1.
6. Assign the actions in T to classes according to Ω_inf^nB* (using the pseudo-conjunctive procedure) or the actions in D to classes according to Ω_inf^nC*. Determine the in-sample effectiveness of Ω_inf^nB* using the accuracy measure presented in Problem (2); similarly, the in-sample effectiveness of Ω_inf^nC* is determined through Problem (3).
7. Create a new set of potential actions, assign them to classes using Ω_DM^nB and Ω_DM^nC, and then determine the approach's out-of-sample effectiveness similarly to step 6.
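Step 7 can be sketched as below, assuming `assign_dm` and `assign_inferred` are hypothetical functions wrapping the simulated DM model and the inferred model, respectively; the out-of-sample effectiveness is simply the proportion of coinciding assignments over the newly generated actions.

```python
def out_of_sample_effectiveness(new_actions, assign_dm, assign_inferred):
    """Proportion of new actions for which the inferred model reproduces the
    assignment of the simulated DM model (illustrative sketch of step 7)."""
    hits = sum(1 for x in new_actions if assign_dm(x) == assign_inferred(x))
    return hits / len(new_actions)
```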

C. RESULTS
Results are presented per method (INTERCLASS-nB and INTERCLASS-nC) and per type of experiment (in-sample and out-of-sample). Since out-of-sample results can be considered as more illustrative of the effectiveness of the approach, we present some graphs of these results in the main text; the graphs for in-sample results are shown in the appendices.

1) INTERCLASS-nB
It is important to note that the main goal of eliciting preference parameters is not to find the exact values of those parameters (if they even exist), but to determine those (not necessarily unique) values that reproduce the expressed preferences of the DM as well as possible.
We provide the results in terms of in-sample and out-ofsample effectiveness in reproducing the assignments from the set of simulated decision-maker preferences.

a: IN-SAMPLE EFFECTIVENESS
The results are shown in Figures 13 and 14 of Appendix A, where the error bars are equivalent to twice the standard deviation of the corresponding averages. Condensed results are given in Tables 2-5.
Given the large number of experiments, mean effectiveness values should be approximately normally distributed. In the following, we use the 2-sample t-test with a significance level of 0.05 and with the null hypothesis ''(H0:) The means of the results of two rows in the table are equal''. The null hypothesis was not rejected when comparing any pair of rows in Table 2, providing evidence of the robustness of the approach with respect to the number of criteria.
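For reference, a comparison of this kind can be run as shown below; the two rows of values are hypothetical and only illustrate the test, not the paper's data.

```python
from scipy import stats

# Hypothetical per-instance mean effectiveness values for two rows of a table.
row_a = [0.97, 0.99, 0.96, 0.98, 0.97]
row_b = [0.95, 0.96, 0.94, 0.97, 0.95]

# Two-sample t-test of H0: "the means of the two rows are equal" at alpha = 0.05.
t_stat, p_value = stats.ttest_ind(row_a, row_b)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, reject H0: {p_value < 0.05}")
```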
The effectiveness as a function of the number of classes is provided in Table 3. The difference between each pair of these effectiveness values was significant. These results provide little evidence indicating that increasing the number of classes has a negative effect on the effectiveness of the approach.
The effectiveness as a function of the number of assignment examples per class is shown in Table 4. There are statistically significant differences among all the effectiveness values in this table, showing that the effectiveness is a decreasing function of the number of objects per class.
The effectiveness of the approach as a function of the number of limiting profiles is shown in Table 5. There is something interesting about the effectiveness values shown in this table: statistical analysis shows that there is a significant improvement in effectiveness when going from one to three profiles, or from one to five; however, there is no evidence that increasing from three to five profiles improves effectiveness. Therefore, the decision analyst should consider that it may not be worth increasing the cognitive effort of the DM.

… higher numbers of criteria imply higher complexities of the problem. Table 7 analyzes the effectiveness based on the number of classes. There is a statistical difference between each pair of values in this table. Here, it is quite interesting how the effectiveness increases when going from two to three classes but decreases when going from three to five classes. This behavior is maintained regardless of the number of profiles or the number of assignment examples (see Figures 5 and 6). Thus, one might wonder whether going from two to three classes increases the reference information, providing more learning capacity to the approach without considerably increasing the complexity of the problem, whereas the new reference information does not compensate for the increase in complexity when going from three to five classes. This hypothesis will be evaluated in future works.

b: OUT-OF-SAMPLE EFFECTIVENESS
The effectiveness of the approach based on the number of assignment examples per class is provided in Table 8. According to the statistical analysis, the hypothesis ''the means of the results of two rows in the table are equal'' is rejected for all pairs of rows, except when the first two rows are compared. Therefore, there is clear evidence indicating that the effectiveness increases as the number of objects per class increases.

Table 9 shows the effectiveness as a function of the number of limiting profiles per boundary. There is no statistically significant difference between the average values in Table 9. To a great extent, this is a surprising result, since an INTERCLASS-nB model with five profiles, although more complex, should ''learn'' better the decision policy that is implicit in the training set.

The average in-sample effectiveness reaches values very close to 1. This shows that the evolutionary algorithm used by the inference procedure behaves quite satisfactorily. The effectiveness is statistically independent of the number of criteria and is a decreasing function of the number of classes. It is also a decreasing function of the number of assignment examples per class, perhaps because increasing this number increases the difficulty of the optimization problem related to parameter inference. Three limiting profiles in each boundary give better results than a single profile, but no further increment is necessary.
Since the inferred model will be used to assign new actions, the analysis of the out-of-sample effectiveness is perhaps more relevant. We can establish the following concluding remarks:
- The values of the effectiveness are slightly higher than the results reported by [12] for ELECTRE TRI-nB.
…

2) INTERCLASS-nC

a: IN-SAMPLE EFFECTIVENESS
Table 10 shows the effectiveness of the approach given different numbers of criteria. The null hypothesis was rejected when comparing all the pairs of rows in Table 10, so the effectiveness does not vary monotonically with the number of criteria: it is degraded from N = 3 to N = 5 and improved from N = 5 to N = 7. Table 11 shows the in-sample effectiveness depending on the number of classes. Again, a statistically significant difference was found in the comparison of all the pairs of rows in Table 11. The in-sample effectiveness is degraded when the number of classes increases.
The effectiveness of the approach with respect to the number of assignment examples per class is presented in Table 12. Again, the null hypothesis was rejected when comparing all the pairs of rows in Table 12. Table 13 shows the effectiveness based on the number of characterizing profiles per class. The differences between the average values are statistically significant, except between card(R_k) = 3 and card(R_k) = 5. Therefore, the decision analyst could ask the DM to provide three characteristic actions per class, but there is no evidence that increasing this number would provide higher effectiveness.

b: OUT-OF-SAMPLE EFFECTIVENESS
The out-of-sample effectiveness of the approach with respect to the number of criteria is shown in Table 14. The statistical analyses of the values in this table show that the only significant difference concerns seven criteria. The increment in effectiveness when going to seven criteria is counterintuitive, although consistent with the in-sample effectiveness provided by Table 10. This effect is also seen when the effectiveness is broken down by different numbers of profiles and characterizing objects (see Figures 9 and 10). A more in-depth analysis is deferred to future work.

Table 15 presents the effectiveness of the approach regarding the number of classes. The statistical tests found a statistically significant difference between all the pairs of rows in Table 15. Thus, here, as in the case of INTERCLASS-nB, the effectiveness of the approach is a decreasing function of the number of classes.

Table 16 shows the effectiveness based on the number of assignment examples per class. The only pairs for which the difference is not significant are three and five examples per class, and nine and twelve examples per class. Therefore, the decision analyst can consider asking the DM to assign up to nine examples per class.

Table 17 shows how going from one to three profiles significantly improves the effectiveness, whereas going from three to five profiles does not. Three characteristic actions per class are better than a single action, but having more than three actions does not seem to be necessary.
The out-of-sample effectiveness reaches values greater than 0.9, clearly greater than that of INTERCLASS-nB. Its most interesting features are: 1. It shows an increasing dependency on the number of assignment examples per class. This means that the learnability of the methods does not reach a plateau within the range of assignment examples tested in our experiments.
2. The effectiveness improves significantly when the number of characteristic actions increases from one to three. From three to five, there is no significant improvement.
3. It is a decreasing function of the number of classes, which is a consequence of the greater difficulty of the assignment problem.
4. The effectiveness seems to increase slightly with the number of criteria. This behavior is the opposite of that observed with the INTERCLASS-nB method. The explanation may be related to the way the measure of effectiveness is defined (see Equation (3)). As the number of criteria increases, there could be more incomparability among actions and characteristic subsets R_k; therefore, χ_DM ∩ χ_inf could be increased, thus producing an improvement in the effectiveness measure of Equation (3).

V. CONCLUSION
This work presents a novel approach to infer the entire set of parameters required to operationalize two recently published multi-criteria sorting methods, INTERCLASS-nB and INTERCLASS-nC. Given a set of assignment examples, a regression-inspired optimization problem is solved by an evolutionary algorithm, which can handle the nonlinear complexity of the interval outranking model; additionally, evolutionary optimization tools are more robust than conventional non-linear programming techniques when the numbers of parameters, criteria, and classes increase. In this way, the cognitive effort required of the DM in the parameter elicitation process is greatly reduced.
Two basic issues should be considered for an appropriate setting of the parameters of a multi-criteria classification model through preference disaggregation analysis: a) the ability to restore the known assignment examples (in-sample effectiveness), and b) the ability to suggest new assignments that the DM considers appropriate (out-of-sample effectiveness). Most of the related papers focus on point a). Using evolutionary algorithms and assignment examples from simulated decision models, the high values of in-sample effectiveness demonstrate that the algorithm finds solutions that are close to optimal. Perhaps the analysis of the out-of-sample effectiveness is even more important; it measures the ability of the method ''to learn'' which assignments the DM considers appropriate, and thus to suggest appropriate decisions about new actions, which is its real application.
In this paper, the quality of the solutions is characterized by both effectiveness measures. Its dependence on the number of limiting profiles (in INTERCLASS-nB) … Point A) coincides with the results reported for ELECTRE TRI-nB by [12]. Point B) is a consequence of the increasing complexity of the optimization problem from which the model's parameters are inferred and the increasing difficulty of the assignment problem. Point C) confirms the premise that more information is usually better than less. Several different behaviors follow: 1. The INTERCLASS-nC out-of-sample effectiveness seems to be higher than that of the INTERCLASS-nB method; however, this comparison is not fair because the definition of effectiveness differs between Equation (2) and Equation (3). The definition in Equation (2) requires an exact coincidence between the DM assignment and the inferred model assignment, whereas the measure in Equation (3) is laxer.
2. As might be expected, the out-of-sample effectiveness of INTERCLASS-nB deteriorates when the number of criteria increases, contrary to INTERCLASS-nC; 3. As expected, the out-of-sample effectiveness of INTERCLASS-nC tends to improve with the number of characteristic actions; on the contrary, this measure seems to be independent of the number of limiting actions in INTERCLASS-nB.
Regarding Point 2, the surprising performance of INTERCLASS-nC could be explained by the increase in incomparability between actions and representative subsets of classes, as was discussed in Section IV, last paragraph.
Convincing quantitative explanations of the above different characteristics should be found by future research works.
The question of which method performs better in a preference disaggregation context remains open as another avenue for future research. We should remark that INTERCLASS-nC uses more information than the pseudo-conjunctive INTERCLASS-nB: the former draws on two different outranking relations (we mean ''x outranks the representative set R_k'' and ''R_k outranks x''), whereas the pseudo-conjunctive INTERCLASS-nB uses only ''x outranks the limiting boundary''. Handling more information could bring a higher learning capacity. A fair comparison should deal with a limiting-boundary-based method similar to INTERCLASS-nB, but handling both ''x outranks the limiting boundary B_k'' and ''B_k outranks x'', perhaps with the use of a conjoint assignment rule combining descending and ascending procedures.

A. IN-SAMPLE EFFECTIVENESS OF INTERCLASS-NB
Results about the effectiveness of the approach regarding the in-sample effectiveness of INTERCLASS-nB are shown in Figures 13 and 14.

B. IN-SAMPLE EFFECTIVENESS OF INTERCLASS-NC
Results about the effectiveness of the approach regarding the in-sample effectiveness of INTERCLASS-nC are shown in Figures 15 and 16.

C. PROCEDURE FOLLOWED BY THE INTERVAL-BASED OUTRANKING APPROACH
Let us assume the notation described in Subsection II-B.
The marginal credibility index of x being at least as good as action y on the j-th criterion, α_j(x, y), depends on the strength of the arguments provided by that criterion to state that ''x outranks y on this criterion''. On the one hand, if g_j ∈ G_1, then α_j(x, y) is defined as …. The discordance coalition is defined as C(yPx) = {g_j ∈ G_1 : g_j(y) − g_j(x) ≥ p_j(·)}; …. On the other hand, if g_j ∈ G_2, then α_j(x, y) is defined as ….

If we now consider a given credibility threshold γ, then the set of all the criteria for which α_j(x, y) ≥ γ holds is called the γ-possible concordance coalition and is denoted by C(xS_γ y). Conversely, all criteria in G \ C(xS_γ y) compose the γ-possible discordance coalition, denoted by D(xS_γ y). In order to ensure that there are some realizations of the criteria weights for which Σ_{j=1}^{N} w_j = [1, 1] is true, the following constraints are imposed: …. The concordance index of xSy, c(x, y) = [c^-(x, y), c^+(x, y)], is then defined as ….

As in the classical outranking approach, its interval-based extension also considers the arguments against the outranking relation through a credibility index of the statement ''the j-th criterion vetoes the assertion x outranks y'', which is denoted as d_j(x, y) and is defined as follows. For each g_j ∈ G_2, d_j(x, y) = p(g_j(y) ≥ g_j(x) + v_j), where v_j is the interval number representing the veto power of criterion g_j. For each g_j ∈ G_1, d_j(x, y) can be calculated in one of two ways, depending on the information available about thresholds. First, if the veto power of the j-th criterion is precise, that is, v_j is a real number, and there is a discordance (pre-veto) threshold u_j ≤ v_j, then d_j(x, y) is (cf. Mousseau and Dias, 2004; Roy and Słowiński, 2008): …. Second, if the veto power of the j-th criterion is imperfectly known, that is, v_j is an interval number, then d_j(x, y) = p(g_j(y) ≥ g_j(x) + v_j).

Let Γ be the set {α_j(x, y) ∈ R : p(g_j(x) ≥ g_j(y)) = α_j(x, y), j = 1, ..., N}. For each γ ∈ Γ, x outranks y with marginal credibility index η_γ and majority strength λ > [0.5, 0.5] (with λ^- ≥ 0.5) if and only if η_γ = min{γ, p(c(x, y) ≥ λ), 1 − max_{g_j ∈ D(xS_γ y)} d_j(x, y)}.
Each η_γ is the credibility degree of the conjunction between i) ''the γ-possible concordance coalition is strong enough'' and ii) ''the γ-possible discordance coalition does not exert veto''. η_γ is interpreted in [34] as a marginal outranking credibility index. Therefore, x outranks y with credibility index η(x, y) = max_{γ ∈ Γ} η_γ, with η(x, y) ∈ [0, 1]; η(x, y) is the interval outranking credibility index. If Γ is an empty set, then η(x, y) is zero.

RAYMUNDO DÍAZ was born in Jalisco, Mexico, in 1977. He received the B.S. degree in economics from the Instituto Politécnico Nacional, in 2000, and the M.S. degree in regional economics and the Ph.D. degree in administration and senior management from the Universidad Autónoma de Coahuila, in 2004 and 2017, respectively.
Since 2021, he has been a full-time Professor with the School of Finance and Administration, Tecnológico de Monterrey. He is the author of six articles and one book. His research interests include portfolio management, stock markets, and time series.
ABRIL FLORES was born in Torreón, Coahuila, Mexico, in 1979. She received the B.S. degree in accounting and informatics from the Autonomous University of Coahuila, in 2001, the M.S. degree in economy focused on finance, in 2010, and the Ph.D. degree in strategic management, in 2016.