An Approach to the Automatic Comparison of Reference Point-Based Interactive Methods for Multiobjective Optimization

Solving multiobjective optimization problems means finding the best balance among multiple conflicting objectives. This needs preference information from a decision maker who is a domain expert. In interactive methods, the decision maker takes part in an iterative process to learn about the interdependencies and can adjust the preferences. We address the need to compare different interactive multiobjective optimization methods, which is essential when selecting the most suited method for solving a particular problem. We concentrate on a class of interactive methods where a decision maker expresses preference information as reference points, i.e., desirable objective function values. Comparison of interactive methods with human decision makers is not a straightforward process due to cost and reliability issues. The lack of suitable behavioral models hampers creating artificial decision makers for automatic experiments. Few approaches to automating testing have been proposed in the literature; however, none are widely used. As a result, empirical performance studies are scarce for this class of methods despite its popularity among researchers and practitioners. We have developed a new approach to replace a decision maker to automatically compare interactive methods based on reference points or similar preference information. Keeping in mind the lack of suitable human behavioral models, we concentrate on evaluating general performance characteristics. Such an evaluation can partly address the absence of any tests and is appropriate for screening methods before more rigorous testing. We have implemented our approach as a ready-to-use Python module and illustrated it with computational examples.


I. INTRODUCTION
Multiobjective optimization problems represent real-life situations where a decision maker (DM) needs to find the most preferred solution in the presence of several conflicting objective functions. Because of this conflict, a feasible solution optimizing all objective functions simultaneously does not exist. Instead, one can identify so-called Pareto optimal solutions, i.e., feasible solutions, where improving any of the objectives is impossible without impairing at least one of the others. We need additional information for comparing Pareto optimal solutions with each other and choosing the most The associate editor coordinating the review of this manuscript and approving it for publication was Genoveffa Tortora . preferred one. This information, referred to as preference information, can be obtained from the DM.
The class of interactive multiobjective optimization methods ( [16], [22], [25], [31]) has many applications in business and industry (see, e.g., [8], [13], [32], [37]). Their main idea is to iterate between the DM and the method until the most preferred solution has been found. The DM provides some preference information; the method utilizes this information to generate one or few Pareto optimal solutions and presents it/them to the DM. Iterations are repeated until a stopping criterion is met. This scheme has two main advantages. First, generating only a small number of solutions in each iteration limits the computational cost compared to deriving a good representation of the whole Pareto optimal set, which may VOLUME 9, 2021 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ be impossible in a reasonable time (if the problem is computationally complex). Secondly, the cognitive load set on the DM is low because only a limited amount of information is considered at a time.
In interactive methods, the DM can gradually understand the relation between their preference information and achievable solutions. Observing the DMs' behavior in real-life situations has resulted in distinguishing two phases of interactive solution processes (e.g., [25]). In the learning phase, the DM aims at understanding the problem by exploring different parts of the Pareto optimal set to identify a region of interest. In the decision phase, the search concentrates on this region to find the final solution.
There are many interactive methods for multiobjective optimization (see, e.g., [16], [22], [23], [31]). Testing their strengths and weaknesses and comparing the performance of particular methods is essential for both theory development and practical applications. By design, testing any interactive method requires the involvement of a DM. Some papers describe experiments with human DMs for evaluating different aspects of interactive methods or their components.
In those experiments, one or few test runs for each studied method were performed with multiple (usually dozens of) participants. Comprehensive reviews of such studies can be found in [2] and [28].
Experimental studies of interactive methods involving human DMs are rather fragmented due to several issues. The first one is the high cost of involving human participants in the experiment, limiting the amount of data obtained and the variety of methods and research questions that can be addressed. Secondly, it is even more costly and difficult to hire participants who would be highly qualified experts in the fields where the considered methods are applied. The lack of such participants undermines the results' applicability to real-world cases since non-qualified participants may behave differently during the solution process. In addition, differences in motivation and responsibility of DMs between experimental and real-life settings may affect the results in unpredictable ways. Last but not least, the complexity of human nature, and the variation of human behavior across individuals, time spans, and test environments, make the uniformity of experimental settings hard to control. For example, learning about the problem may significantly change a DM's behavior when using a second method, for which reason test results with the same person when solving similar problems may depend on the order of applying the methods.
A natural way of addressing some of these issues is to simulate the DM's responses to conduct experiments without humans. We refer to this as an artificial DM (ADM). This ADM should provide preference information in a format that is appropriate for the tested methods. Thus, different types of methods may require constructing different ADMs. In this regard, we can classify interactive methods into two types [22], [31]: non-ad hoc methods, where the DM can be easily replaced with a utility or value function, and ad hoc methods, where such a replacement is problematic.
We can test non-ad hoc methods using utility or value functions (see, e.g., references in [3] and [36]), but a large class of ad hoc methods has been left behind, among them the important class of reference point-based methods (see, e.g., [39], [41]). Researchers and practitioners use such methods widely (see, e.g., [23], [43]), and this imbalance is this paper's motivation.
Reference point-based interactive methods utilize preference information provided as desirable aspiration levels forming a reference point and possibly reservation levels forming another reference point (or weights of objectives). Expressing preference information in this way is in line with the concept of ''satisficing'' ( [30], [40]), which is regarded as cognitively undemanding for the DM. However, creating an ADM that expresses preference information in terms of reference points is not a trivial task. Furthermore, unlike utility functions or preference relations, which have been studied from the behavioral perspective, reference point-based preference models lack formal theories connecting them with human psychology.
In some works, the interaction between the DM and the method is simulated for testing purposes (see, e.g., [7], [20], [33]). However, only in a small number of papers simulation algorithms are proposed, which are suitable for testing reference point-based methods. In [34], a reference point in each iteration is set relative to the Pareto optimal objective vector obtained in the previous iteration. The new reference point is obtained using the gradient of a value function. In [35], the reference point is selected among modified objective vectors. The value function is used for selecting one of them. In [44], the reference point is obtained as a solution to a computationally complex problem. In [27], reference points are generated by adjusting aspiration levels in each iteration, considering priorities of objectives, and involving randomness. Reference [4] generates reference points based on particle swarm optimization. In [1], an artificial DM is developed for simultaneous comparison of several interactive reference point based evolutionary multiobjective optimization methods. Different mechanisms are proposed for the learning and the decision phases to generate reference points. Furthermore in [15], reference points are generated to test a specific interactive method for solving a bi-objective inventory routing problem.
So far, none of the mentioned approaches has been applied or further elaborated outside the scope of the article where it was first presented. A possible reason may be the lack of a formal model of the ADM per se. Without such a model, it is hard to tell to what extent test results are related to the actual performance of methods in real-life settings. Because of the difficulties with testing reference point-based methods, publications of new methods never include performance evaluations. All mentioned works on method comparisons were conducted post-factum, in some cases years after the methods had been first proposed.
In this paper, we develop a new ADM to test and compare reference point-based interactive methods without human involvement. We derive the ADM's behavior from some rational assumptions about utilizing information provided by the method. The ADM cannot fully replace human DMs due to the lack of comprehensive behavioral models and difficulties in formalizing the desirable properties of interactive methods [2]. However, it allows evaluating some performance characteristics which are relevant in the decision-making context. Similar to using the Pareto dominance relation for removing inadequate solutions, our ADM can be used as a preliminary filter before the final comparison of interactive methods involving humans.
As mentioned, e.g., in [2], better means for comparing and assessing interactive methods are needed. We address this challenge and summarize our main novel contributions as follows: • We introduce a new ADM for comparing interactive reference point-based methods. Except for the type of preference information, it does not set any other limitations to the nature of the methods compared. Thus, our ADM enables comparing ad hoc methods and, in this respect, fills a gap in the literature.
• We build the ADM on rational assumptions utilizing information that methods compared provide. This enables evaluating how well methods support the exploration of the objective space. • We offer an open Python implementation of the ADM, which makes it easily accessible. With it, a) people publishing new interactive methods can compare the new to old ones, and b) people looking for a suitable method for a problem at hand can filter viable candidate methods. This paper is organized as follows. In the next section, we summarize basic concepts used, including a general scheme of interactive methods. In Section III, we present our ADM together with concrete guidelines for connecting it with a method and conducting tests. We present some exemplary results of computational tests in Section IV and conclude in Section V with a discussion.

II. MULTIOBJECTIVE PROBLEM AND REFERENCE POINT-BASED METHODS
We consider multiobjective optimization problems formulated as follows: with decision vectors x ∈ S ⊆ R n , where S is a nonempty compact set of feasible solutions, k ≥ 2 is the number of objective functions, and f i : S → R, i = 1, . . . , k are the objective functions. For any x ∈ S, the vector f(x) := (f 1 (x), . . . , f k (x)) T ∈ R k consisting of objective (function) values is called an objective vector, R k is referred to as the objective space, and R n as the decision space. The image of S in the objective space is defined by f(S) = {f(x) : x ∈ S}.
Solving the problem means finding a feasible solution, which is the most preferred for the DM -a person or a stakeholder who knows the substance of the problem, can provide preference information, and is responsible for the final solution. The DM compares feasible solutions solely based on their objective vectors, and from the DM's point of view, solving the problem is equivalent to finding the most preferred objective vector in f(S).
Since the DM prefers smaller objective values to larger ones, the search of the most preferred solution is limited to the set of Pareto optimal solutions The objective vector f(x) corresponding to a Pareto optimal solution x is called a Pareto optimal objective vector, and the set of all such objective vectors is referred to as the Pareto front.
Let us also define an ideal objective vector and a nadir objective vector, respectively, as z = (z 1 , . . . , Note that calculating the ideal objective vector is reduced to separate minimization of each objective function in S. The nadir objective vector is more difficult to obtain, so it is usually estimated (see, e.g., [9], [22] and references therein).
Step 0. Initialization of the method. Present some information about the problem to the DM. Step 1. Ask the DM to specify preference information as a reference point.
Step 2. Generate one or several Pareto optimal solutions by minimizing method-specific subproblems in S and present the solution(s) to the DM. Step 3. Ask the DM if one of the previously generated solutions is satisfactory as a final solution to the problem. If yes, stop; otherwise, go to Step 1. Steps 1-3 represent one iteration.
Step 0 is the initialization step, and the requirement to present problem-related information to the DM varies across methods. Some of them approximate the ideal and the nadir objective vectors to give the DM some idea of the ranges of objectives among Pareto optimal solutions. Other methods derive an initial Pareto optimal solution that the DM needs to express preferences in the first iteration. Without loss of generality, we assume that the information presented to the DM in Step 0 includes at least approximations of the ideal and nadir objective vectors.
Reference point-based methods derive Pareto optimal solution(s) in each iteration by solving scalarized problems. A special class of such problems is based on achievement scalarizing functions (ASFs) [40], [41]. There are many variants of ASFs (see, e.g., [39]- [41]) and different ways of expressing preference information in terms of ASF parameters (see, e.g., [19], [21]). All ASFs proposed in the literature can be divided into two types with respect to their number of parameters: either one or two k-dimensional vectors. Both types involve a reference point composed of aspiration levels for k objectives: z asp = z asp 1 , . . . , z asp k . The aspiration levels represent desirable objective function values. ASFs of the second type incorporate another vector of k components: either positive weights of objectives λ 1 , . . . , λ k , or reservation levels forming a second reference point z res Reservation levels represent objective function values the DM prefers to avoid.
We consider the following subproblem involving an ASF: where ρ(x) is a linear augmentation term for ensuring Pareto optimality of any derived solution. 2 If vector z res is provided, then according to [18], [19], weights can be calculated by Another popular approach to specifying weights is by using the formula . . , k, are components of a so-called utopian objective vector, and µ is a negligibly small positive number. Our approach to replace a DM can provide preference information in any form suitable for either of ASF types described above.
To summarize the interaction between the DM and any reference point-based method focusing on the exchange of information, from the method's point of view, the DM is involved at four different occasions: Step 0, the method passes the initial information about the problem, which includes the ideal and the nadir objective vectors, to the DM; • in Step 1, the method receives preference information from the DM; • in Step 2, the method passes to the DM the set of derived Pareto optimal solutions; • in Step 3, the method receives from the DM a signal to stop the solution process. It is natural to assume that the DM saves the information about all previously derived Pareto optimal solutions in a solution pool. This allows the DM to select the most preferred solution from among those derived in the last iteration and from among previously obtained solutions. 3 The ADM is characterized by a choice function, which selects a set of most preferred objective vectors from any given set of Pareto optimal objective vectors. It can be interpreted as a steady, complete model of the ADM's preferences. To make this model operational, we represent it as a value function at the expense of imposing some assumptions. Such a representation is common among other approaches to automated testing of interactive methods ( [3], [34]- [36]).
It is realistic to assume that we cannot directly apply the choice function of the ADM to locate the most preferred objective vector (otherwise, there would be no need to use an interactive method). However, the ADM can apply the choice function for selecting the most preferred objective vector among any finite, explicitly given set. Communication with the interactive method is the only means of obtaining information about the problem. The ADM constructs knowledge about the Pareto optimal set to efficiently utilize this information and uses it to guide the search.
We structure the ADM in three parts and describe them in the following three subsections. The parts are the value function characterizing the ADM's preferences (the steady part), the representation of knowledge about the Pareto optimal set (the current context), and the mechanism for generating preference information based on these two (the preference generator). This structure is borrowed from [27], which fact constitutes the only similarity with the reference.

A. THE STEADY PART
The choice function characterizing an ADM is (see, e.g., [29]): Since it is only applied to choosing from finite sets, we redefine the function domain as the set of all finite subsets of R k without loss of generality: where R = Y ⊂ R k : |Y | < ∞ , Y are sets of objective vectors and |Y | denotes the cardinality of Y .
Let us assume that the choice function satisfies the weak axiom of revealed preferences (or any other equivalent rationality axiom in [29]): In other words, if two elements are both most preferred in some set, then there are no circumstances where one element is preferred over another. As a direct consequence, we have the following statement [29]: Statement 1. There exists a linear order (a complete, transitive and antisymmetric binary relation) on R k , which represents (6): Furthermore, let us assume that the ADM can be applied to solve only a countable set of problems, which implies that the binary relation is defined on a countable set of objective vectors. This technical assumption is required for applying another theoretical result of the rational choice theory [5], stating that any complete and transitive binary relation on a countable set has a value representation. Thus, we have the following statement: Summing up, from reasonable assumptions we obtain Statements 1 and 2, saying that the model of steady preferences initially defined as the choice function C can be represented as a value function v : R k → R :

B. THE CURRENT CONTEXT
By the current context, we mean the knowledge about the Pareto front, which the ADM constructs from information accumulated during the solution process. As said before, the ADM receives the following information from the interactive method: the ideal and nadir objective vectors at the beginning of the solution process and a set of Pareto optimal objective vectors derived at each iteration. This information is included in the current context and used by the ADM for deriving information about the potential location of new Pareto optimal objective vectors that have not been generated yet. We shall now describe this in detail.
Given two objective vectors z and z such that z i ≤ z i for any i = 1, . . . , k, we define a box as the Cartesian product of closed intervals: and call z and z its minimal and maximal points, respectively. Using this notation, we define the initial region as follows: It is the box in the objective space containing the whole Pareto front. The role of the initial region in constructing the current context is to limit the area of the potential location of new Pareto optimal objective vectors.
Let j be an iteration number. After completing the iteration, the solution poolŶ j is defined bŷ where Y l is the set of Pareto optimal objective vectors derived in iteration l. In addition, we setŶ 0 = ∅. Now we introduce a set of objective vectors called the potential region R j : It is easy to see that if a Pareto optimal objective vector derived in the initial region during the iteration j + 1 is new (i.e. it does not belong toŶ j ), then it belongs to R j : otherwise it would either dominate or be dominated by one of the previously derived Pareto optimal solutions. Therefore, information about the potential region is relevant when searching for new Pareto optimal solutions.
We represent the potential region in each iteration as a collection of boxes in the objective space. Once the ideal and nadir objective vectors have been obtained in Step 0, the ADM initializes the potential region as one box, namely the initial region: R 0 = {B}. After having received the set of Pareto optimal objective vectors Y j derived in Step 2 of the j-th iteration, the ADM updates the potential region. It subtracts the set of vectors which dominate or are dominated by the elements of Y j from the area comprised of all the boxes.
Let us define a set of 2 k orthants of the objective space: Algorithm 1 can be used for updating the potential region, where we denote any box included in the potential region by β and the current iteration number by j.
It is important to note that if Y j ⊆Ŷ j−1 (no new Pareto optimal objective vectors have been derived in iteration j, j > 1), the current context does not change according to Algorithm 1. On the other hand, the information about the absence of new solutions should be taken into account by the ADM. To address that, we introduce a small modification to the potential region. It is described at the end of Subsection III-C since its justification refers to an element of the preference generator.

C. THE PREFERENCE GENERATOR
In each iteration j, the search for the most preferred solution is naturally narrowed down to the potential region. We determine a box in R j , where such a solution can most likely be located, and to explore this box, we ask the method to derive Pareto optimal solution(s) in it.
Without complete knowledge about the Pareto optimal set, determining which box is most promising in the above sense is not possible. We propose considering the objective vector in the center of a box (called the midpoint) as a neutral guess about its content and applying the ADM's steady preference model for selecting one among the midpoints of all boxes in R j . Then the ADM instructs the method to derive Pareto optimal solution(s) in the box by setting the reference point at the midpoint of the selected box. We set the vector of reservation levels at the maximal point of the box, reflecting the fact that it represents the upper bounds on components of Pareto optimal objective vectors, which can be possibly derived in this box. We formalize this procedure of generating preference information as follows. VOLUME 9, 2021 Algorithm 1: Updating the Potential Region Initialize R j := R j−1 . For each Pareto optimal objective vector y ∈ Y j \Ŷ j−1 : For each box β ∈ R j : Check the relation between β and the set then β cannot contain any new Pareto optimal objective vectors. In this case, remove β from R j . If β ∩ y + R k < ∪ y − R k < = ∅, then β can still contain new Pareto optimal objective vectors. In this case, keep β in R j . If β ∩ y + R k < ∪ y − R k < is neither empty nor coinciding with β, then a part of β can still contain new Pareto optimal objective vectors, and the rest of β cannot. In this case, modify R j in order to get rid of vectors belonging to β ∩ y + R k < ∪ y − R k < as follows: Split β into a set of boxes, where each box is the intersection of β with one of the 2 k orthants from anchored at y. Include in R j all newly obtained boxes except the following two: • For all boxes in the potential region, the ADM determines their midpoints forming the set where • Using the choice function (5), the ADM selects the subset of the most preferred midpoints: • The ADM chooses any box β whose midpoint belongs to , and generates preference information in terms of aspiration and reservation levels: z asp = Mid(β ), z res = z where z is the maximal point of β . After generating the preference information, the ADM expresses it depending on the requirements of the interactive method: • if the method requires one reference point, then the ADM returns z asp ; • if the method requires two reference points, then the ADM returns z asp and z res ; • if the method requires a reference point and weights, then the ADM returns z asp and the vector of weights (λ 1 , . . . , λ k ) calculated according to (3).
Since the derivation of Pareto optimal solutions is confined to the most interesting box selected by the ADM, we can address the case where Y j ⊆Ŷ j−1 for some j, j > 1, i.e., no new Pareto optimal objective vectors have been derived in iteration j. We slightly amend Algorithm 1 for updating the current context described in Subsection III-B based on the following reasoning. The lack of new Pareto optimal objective vectors implies that the interior of the most interesting box selected by the ADM contains no solutions. Since this box has been explored, we should not repeat an attempt to derive solutions in it. Therefore, after completing each iteration, the box β that was selected by the ADM is removed from the potential region.
We illustrate the process of generating preference information described above in Fig. 1 in the case of k = 2. Let us assume that in some iteration, the solution pool consists of three Pareto optimal objective vectors z 1 , z 2 and z 3 . We show the boxes comprising the potential region in gray color. The dashed curves depict isoquants of the value function representing steady preferences. Small crossed circles in the centers of the boxes mark their midpoints. It is easy to notice that the midpoint of the box between solutions z 2 and z 3 has the highest value of the value function. After the ADM selects this box and generates a reference point in it, the method derives a new Pareto optimal objective vector z 4 . The ADM adds this vector to the solution pool and updates the potential region by removing from it the areas z 4 +R 2 < and z 4 −R 2 < . It divides the box under consideration into four boxes and deletes two of them as described in Subsection III-B.

D. THE SCHEME OF IMPLEMENTING TESTS USING THE ADM
Let us first describe the aspects of interactive methods which can be evaluated using the ADM. The central question is how well the method finds the most preferred solution, referred to as the performance aspect. We can measure it by the following two means, which are dual to each other.
• ''Accuracy'': given the maximum number of iterations, how close to the most preferred solution can a solution be found.
• ''Convergence speed'': given desirable accuracy in the above sense, how many iterations are needed to reach this accuracy.
To evaluate these means for a given problem and an ADM's steady preference model, we must first derive the most preferred solution regarding the ADM's steady preferences. We should introduce a measure of closeness of any Pareto optimal solution and the most preferred solution. We must perform a predefined number of iterations to calculate the accuracy and the minimum proximity between the derived solutions and the most preferred solution. To calculate the convergence speed, we must perform iterations until the proximity of one of the derived solutions falls below the desirable accuracy. The number of iterations serves as the result of calculations. It is practical to set the threshold as the maximal number of iterations to limit the time needed for conducting experiments in the latter case. If the desired accuracy has not been achieved, the convergence speed can be described as ''failed to converge in the allotted number of iterations.'' Another critical aspect is how well the method explores the objective space regarding the accuracy of overall information about the Pareto front. We can quantify this so-called exploration aspect as the hypervolume of the potential region. The smaller is the hypervolume of R j , the less uncertainty the ADM has about the shape of the Pareto front after the i-th iteration.
We propose the following scenario of conducting experiments. We formulate the main research question by the two means describing the performance. After each run, we accompany the performance analysis with the exploration analysis and consider the hypervolume of the potential region after each iteration. The stopping criterion is formulated based on the experiment design. Depending on the performance analysis, the stopping criterion can be, for instance, a certain number of iterations or a certain closeness to the most preferred solution.
Let us now present the scheme of implementing experiments. In testing, the ADM is a black box replacing the DM in the interaction introduced in Section II. During experiments, the ADM and the method are called in turns, exchanging information: the output from the ADM serves as the input for the method and vice versa until a stopping criterion is met. Fig. 2 summarizes how one can use an ADM consisting of the three described parts to test an interactive reference point-based method. Below, we specify the details of the interaction.
• In Step 0, the method passes initial information about the problem (z and z nad ) to the ADM. As described in Subsection III-B, current context is initialized: the potential region R 0 as {β z , z nad } and the solution poolŶ 0 as ∅.
• In Step 1, the method asks for preference information as parameters of the ASF. The preference generator creates such information based on the steady part and the current context as described in Section III-C, and passes this information to the method.
• In Step 2, the method passes to the ADM the set of Pareto optimal solutions Y j derived in iteration j. The ADM uses it to update the current context: it is included in the solution pool and used for updating the potential region as described in Section III-B.
• In Step 3, the method asks for information related to stopping the solution process.

IV. TEST EXPERIMENTS
We have implemented the ADM in Python 3 and published the code under Mozilla Public License 2.0 on GitHub: https://github.com/industrial-optimization-group/ desdeo-adm.
Our purpose is to demonstrate what meaningful information and valuable insights we can gain about interactive methods. We define basic experimental settings and, during the runs, collect information on the performance and exploration aspects introduced in Subsection III-D.
We solve one problem using two popular methods: the reference point method (RPM) [39], [40] and the synchronous NIMBUS method [24], [26]. The latter uses the classification of objectives as preference information indicating how a Pareto optimal solution (called a current solution) should be improved. We can obtain this information from a reference point if the latter does not dominate or is not dominated by the current solution. We show that the ADM can test a broader class of interactive methods by considering a classification-based method.
Below, we briefly summarize the two methods and describe how one can convert a reference point into preference information for NIMBUS. After that, we define the value function representing the steady part of the ADM, formulate an example problem and present the results of experiments.

A. REFERENCE POINT AND NIMBUS METHODS
Both methods fit into the core structure described in Section II. In Step 1 of the RPM, the DM specifies a reference point z asp . In Step 2, the method derives k + 1 Pareto optimal solutions as follows. First, the method solves (2) with the given reference point and fixed weights defined by (4), obtaining a Pareto optimal objective vectorz (the current solution). Then, k more reference points are created by perturbing z asp , where the i-th perturbation, i = 1, . . . , k, consists of adding the value z asp −z to the i-th component of z asp . For each of these reference points, the method solves problem (2). In this way, the DM gets a better idea of the surroundings of the current solution in the objective space. Thus, it is enough to pass the first reference point to the RPM as the preference information.

In
Step 0 of the synchronous NIMBUS method, in addition to the ideal and nadir objective vectors, a Pareto optimal solution is generated as the first current solution. It is needed for expressing preference information in the first step. Since this Pareto optimal objective vector belongs to the solution pool, in terms of the core structure of interactive methods (Section II), it should be treated as a result of an iteration rather than a part of the initialization step. Thus, in our experiments we assume that in the first iteration, the ADM provided the reference point z asp := Mid(B) with the components z nad i +z i 2 , i = 1, . . . , k, to get a neutral compromise solution [24] by solving problem (2).
In Step 1 of NIMBUS, the DM expresses preference information for the current solutionz by classifying the objective functions into up to five classes. This classification shows the desirable way of changing the objective vector component-wise to get a more preferred solution. The classes are subsets of functions f i , i = 1, . . . , k, whose values should be decreased (i ∈ I < ), should be decreased till a desirable level γ i <z i (i ∈ I ≤ ), are satisfactory at the moment (i ∈ I = ), are allowed to increase till an upper bound ε i >z i (i ∈ I ≥ ), and are allowed to change freely (i ∈ I ).
Some of the classes may be empty; however, the following conditions should be satisfied for a classification to be feasible: I < ∪ I ≤ = ∅ and I > ∪ I = ∅. In Step 2, the method derives up to four Pareto optimal solutions by solving four different scalarized subproblems, reflecting the diversity of interpretations of the preference information. We give a simple way of converting reference points into classification following [24] in Appendix I.

B. THE STEADY PART OF THE CONSIDERED ADM
As a value function, we use the Cobb-Douglas function (see, e.g., [14], [38]). It is defined for non-negative objective values to be maximized: . . . , α k ) and α i , i = 1, . . . , k, are non-negative parameters. To use this function, we map the objective space of problem (1) to the objective space of a maximization problem with non-negative objective values: In other words, we normalize each component of the objective vector to the interval from 0 (the worst value) to 1 (the best value), and add a small term to avoid the marginal rates of substitution diminishing to zero. The value function representing the steady ADM's preferences is where g(z) = (g 1 (z 1 ), . . . , g k (z k )).

C. EXAMPLE PROBLEM
We tested the two interactive methods on a three-objective optimization problem. This number of objectives ensures that both the RPM and NIMBUS solve the same number of subproblems in each iteration. We considered the following modification of the problem formulated in [12]: where For this problem, we have z = (−1, −1, −1) and z nad = (1, 2, 2). We have chosen this particular problem because it has a nonconvex Pareto front of a complex shape, making the solution process more challenging.
It is a nonlinear problem. The scalarized subproblems were solved using the simplicial homology global optimization (SHGO) algorithm [11] published in Github https://github.com/Stefan-Endres/shgo and available as a part of the SciPy library. It is well suited for problems of this type and has an easy-to-use interface.

D. EXPERIMENT DESIGN
We characterize an instance of the ADM by a vectorᾱ = (α 1 , α 2 , α 3 ) of parameters of the value function v defined by (10), which represents the steady part. We used each instance for solving the example problem independently by both considered methods. We performed 100 experiments with randomly generatedᾱ. We ran the solution process with each instance for up to 25 iterations, as this number exceeds the typical duration of real-life solution processes reported in the literature.
At the end of each iteration i, i > 0, after the ADM updated the current context taking the derived solutions into account, we collected the following indicators characterizing the solution process.
1) The iteration value defined as v i v · 100%, where v i is the maximum value of the value function v among the solutions derived in iteration i, and v is the optimum value of the value function for problem (11). This indicator reflects the quality of obtained solutions and can be interpreted in terms of the proximity of the best solution among the obtained ones to the most preferred solution.
2) The volume of the potential region, which characterizes the exploration aspect.
3) The total number of Pareto optimal solutions accumulated in the solution pool. As explained later, we consider this indicator for controlling the fairness of method comparisons. Fig. 3 illustrates the results of running the solution process for 25 iterations using both methods with the ADM characterized byᾱ = (1, 1, 1). It includes the plots of the iteration value (black lines, the left-hand axis) and volume of the potential region (gray lines, the right-hand axis) along with iterations. The indicators of the RPM and NIMBUS are depicted by solid and dashed lines, respectively. The small rhombuses mark the maximum iteration values among 25 iterations for both methods.
For the RPM, 99.2% of the optimum was achieved in the 21st iteration, and in the case of NIMBUS, 99.7% of the optimum was achieved in the 18th iteration. The ADM demonstrated the same pattern for both methods: the iteration value gradually stabilized near the highest value while the potential region volume decreased. This observation indicates that an improvement in the quality of the derived solutions goes hand in hand with the accumulation of knowledge about the Pareto front.

E. RESULTS OF EXPERIMENTS
In each of the 100 experiments, we generated the coefficients α 1 , α 2 , and α 3 as independent random variables uniformly distributed in the interval [1,2]. During the test of each of the methods with the same ADM, we calculated the following indicators.
1) The maximum among iteration values achieved during the first ten iterations. This characterized accuracy, where we interpreted the obtained value in terms of the proximity to the most preferred solution. We chose ten iterations as a specific number of iterations to be completed, taking into account our experience with real-life interactive solution processes.
2) The smallest iteration number where the iteration value of 95% was reached, referred to as the iteration number of 95% level. This indicator characterized the convergence speed. Not getting a 95% level in 25 iterations was considered a failure.
3) The volume of the potential region after ten iterations characterizing the exploration power of the method. 4) The size of the solution pool after ten iterations used for controlling the fairness of experiments.  in parentheses for each of the mentioned indicators. The last column summarizes the differences in the indicators values between the RPM and NIMBUS in each experiment. On average, the RPM performed similarly to NIMBUS in terms of accuracy 5 and 21.6% faster in terms of convergence speed. The RPM also reduced the average potential region volume after the 10th iteration to 7.12. It is 10.1% less than what was done by NIMBUS (note that the volume of the initial region is equal to 18). We can view this as a better exploration of the objective space.
On the other hand, the RPM provided the ADM with, on average, 2.9 (8.6%) more solutions in 10 iterations, which can be partially explained by the special treatment of the first iteration in NIMBUS (where only one solution was derived, see Subsection IV-A). This advantage partially explains the better performance of the RPM.
Another important observation from the experiments was that the differences between the methods' performance varied significantly across ADM instances. For the maximum iteration value and the iteration number of 95% level (see the first two cells of the third column), the standard deviation of the difference was 5.7 and 2.5 times higher than the absolute average difference, respectively. In addition, in all the experiments where the RPM failed to achieve 95% iteration value, the NIMBUS method succeeded, and vice versa. In other words, the differences in those methods' performance are sensitive to the parameters of the ADM's steady preference model.
The above observation sheds light on the issue of selecting an interactive method. It appears that even if a problem instance and a class of steady preference models are given, there is not enough information to conclude which method is better. The recommendation on method selection may depend on the parameters of the preference model. This dependence undermines the possibility of universal recommendations, suggesting that it may be necessary to select an interactive method for each DM.
One can look for patterns in the dependence between the preference model parameters and methods' performance by conducting experiments, where parameters change systematically. As an example, we set 6 the value of α 3 to 1, and varied α 1 and α 2 independently from 1.5 to 2.5 with step size 0.2. For each of the 36 resulting parameter vectors, we ran the solution process with both the RPM and NIMBUS in the same way as in the former series of experiments. The tables in Appendix II present the main performance indicators depending on the parameter values: the maximum iteration value achieved in the first ten iterations (accuracy) and the iteration number, where a 95% iteration value has been achieved (convergence speed). We can notice that better performance of both methods in terms of both indicators was achieved for the lowest value of α 1 as well as low ratios between α 1 and α 2 . However, the combinations of parameter values where each method demonstrated good performance did not completely coincide. In terms of accuracy, good performance of the RPM was also achieved for some combinations where α 2 ∈ {0.7, 0.9}, and good performance of the NIMBUS method was also achieved for α 1 = 1.5 and 0.7 ≤ α 2 ≤ 1.3. Interestingly, for the combinations α 1 ∈ {1.3, 1.5} and α 2 = 0.5, despite the RPM had better performance in terms of accuracy, it had a bad performance in terms of convergence speed as the iteration value of 95% was not reached in 25 iterations. Summing up, the influence of parameters of the value function on methods' performance does not look arbitrary but demonstrates some patterns. The analysis of such patterns could be utilized for predicting methods' performance based on parameter values.
Let us note that we must judge the fairness of comparing methods individually for various experimental settings. For example, one must decide whether it is fair to conduct the same number of iterations (knowing that different methods may need different amounts of computations per iteration). Alternatively, one can apply the same number of function evaluations or require a similar amount of information from the ADM. For example, due to the special treatment of the initialization phase, the NIMBUS method derived fewer solutions in the first iteration than the RPM. We can consider the amount of cognitive load set on the DM as a ''common denominator'' for establishing the equality of experimental settings for human DMs. In the case of ADMs, we can estimate corresponding information by the size of the solution pool.

V. CONCLUSIONS
The concept of reference points is widely used for representing preference information in interactive multiobjective optimization methods. The class of reference point-based methods is important from theoretical and practical points of view but lacks comparative studies. We have proposed an approach called an ADM to enable comparing interactive reference point-based methods. One can apply it for preliminary comparison of methods of this class before involving a human DM. The transparent description of the principles underlying the ADM's operation and a ready-to-use implementation in a widely used language (Python) should make our development easy and attractive to apply.
We demonstrated how one can apply the ADM with some preliminary experiments and that it can exhibit complex operations and help draw nontrivial conclusions. For example, we were able to quantify how sensitive differences are between methods' performance to parameters of the ADM. It appears that the recommendations on method selection may require adjustment in each case. We can establish the fairness of method comparison in terms of the number of solutions provided to the ADM in the same number of iterations.
The current implementation of our approach is not suitable for comparing interactive methods where Pareto optimal solutions are approximated. If different elements of the solution pool can dominate each other, then the premise that any newly derived objective vector belongs to the potential region is violated. Algorithm 1 of updating the potential region should be revised to adapt our approach for comparing approximate methods. In the long run, it is desirable to compare methods from different classes (using different preference types), employing other ADM approaches calibrated to ensure fairness of comparing methods.
As said in the introduction, we have not intended the ADM to replace human DMs in tests of interactive methods fully. Due to the broad spectrum of human behaviors, evaluating the accuracy of the ADM should be done in the framework of a multidisciplinary study involving many experiments with real DMs in various settings. That would pave the way for further improvement of the proposed approach.

APPENDIX I GENERATING NIMBUS-RELATED PREFERENCE INFORMATION FROM A GIVEN REFERENCE POINT z asp
First, it is natural to select the current solution in iteration j from among those members of the solution pool which are  the most preferred (according to the steady part): z ∈ C(Ŷ j ).
Secondly, each objective function is assigned to one of the five classes by consecutively applying the following rules (once an assignment is done, the remaining rules do not apply): • if z asp i is close to z i , then assign i to I < , • if z asp i is close to z nad i , then assign i to I , to z i or z nad i mentioned in the first two rules means that the absolute difference between z asp i and the corresponding value is less than 1% of the difference z nad i − z i . Observe that due to the method of defining preference information (see Subsection III-C), a reference point z asp generated by the ADM always belongs to the interior of the potential region. It follows that z asp never dominates nor is dominated by any Pareto optimal objective vector. In this case, it is easy to see that the classification obtained as above is always feasible. In both Tables 2 and 3, the row and column headers represent the values of α 1 and α 2 , respectively. Each cell contains two values of the performance indicator obtained in the experiment with the corresponding parameter vector (α 1 , α 2 , 1): the value for the RPM above the value for the NIMBUS method.

APPENDIX II INDICATORS OF METHOD PERFORMANCE DEPENDING ON ADM PARAMETERS
DMITRY PODKOPAEV received the Ph.D. degree in mathematics and physics from the Institute of Mathematics, National Academy of Sciences of Belarus. He currently works as an Assistant Professor with the System Research Institute, Polish Academy of Sciences. His research interest includes multiobjective optimization as a methodology for decision making in complex systems. He develops optimization methods, preference models, and interactive decision support systems, and has applied multiobjective optimization methods in various application domains, such as medicine, finance, computer systems, and supply chain management. He has the title of Docent in multiobjective optimization from the University of Jyväskylä, Finland.
KAISA MIETTINEN received the Ph.D. degree in mathematical information technology from the University of Jyväskylä (JYU), Finland.
She has worked with IIASA, Austria; the KTH Royal Institute of Technology, Stockholm, Sweden; and the Helsinki School of Economics, Finland. She is currently a Professor of industrial optimization with JYU. She also heads the Research Group on Multiobjective Optimization and is the Director of the thematic research area ''Decision Analytics Utilizing Causal Models and Multiobjective Optimization'' at JYU. She has authored about 190 refereed journals, proceedings, and collection papers; edited 17 proceedings, collections, and special issues; and written a monograph on nonlinear multiobjective optimization. Her research interests include theory, methods, applications, and software of nonlinear multiobjective optimization. She is a member of the Finnish Academy of Science and Letters, Section of Science, and has been the President of the International Society on Multiple Criteria Decision Making (MCDM). She has received the Georg Cantor Award of the International Society on MCDM for an independent inquiry in developing innovative ideas in the theory and methodology of MCDM. She belongs to the Editorial Board of seven international journals and the steering committee of Evolutionary Multi-Criterion Optimization.