A Metaheuristic Search Algorithm Based on Sampling and Clustering

As optimization problems become more complicated and extensive, parameterization becomes complex, making it a difficult task that requires significant amounts of time and resources. In this paper we propose a heuristic search algorithm we call MCSA (Montecarlo-Clustering Search Algorithm), which is based on Montecarlo sampling and a clustering strategy involving two techniques. Our objective is to apply MCSA, an inherently stochastic method, to address optimization problems. To assess its performance, we conducted an evaluation using classical benchmark optimization functions. Additionally, we leveraged the CEC2017 benchmark suite to comprehensively evaluate the algorithm, highlighting the pivotal role of the Exploration stage in our methodology. Subsequently, we extended our methodology to tackle a practical combinatorial problem, the Knapsack Problem. This NP-Hard problem holds significant real-world applications in resource allocation, scheduling, planning, logistics, and more. Our contributions lie in parameterizing the Knapsack Problem to align with MCSA's parameters for reference-indicator adjustment, and in achieving high-quality solutions, surpassing 90% in comparison to exhaustive methods such as branch and bound.


I. INTRODUCTION
Recently, optimization problems have become more complicated and extensive, exposing the inability of exact optimization methods to solve complex and multidimensional problems. Therefore, approximation algorithms have been proposed as a new approach to solving such problems. Each type of optimization has its own challenges. Single-objective optimization, however, can be included in many other types of optimization, like constrained optimization, multi-modal optimization, multi-objective optimization, etc. [1], [2]. Regarding the dimensionality of an optimization problem, it is challenging, since the search space becomes enormous when the dimensions of a given problem increase [3]. The domains of the variables (lower and upper bounds) determine a search space. However, due to the complexity of continuous optimization problems, which cover a wide range of different issues, even mathematical methods (used to solve scientific and engineering problems) have had difficulties regarding the extensive calculation needed [4]. Therefore, the use of meta-heuristic algorithms has grown, and they are now a reliable tool for solving various search problems, both continuous and discrete.
Meta-heuristic algorithms benefit from the Exploration process, since they can produce new solutions while visiting the search space. Gradually, the algorithms transition to Exploitation, focusing on improving the accuracy of the solutions obtained in the Exploration phase. On the other hand, the Exploitation process generally produces a new solution based on the best solution available [5], [6], [7], [8]. As a result, the challenge that meta-heuristic algorithms face lies in balancing the essential Exploration and Exploitation processes to avoid getting trapped in a local best solution while converging towards the target [9], [10].

FIGURE 1. Overview of MCSA heuristic method [44].
In this work, we present a heuristic methodology that features the Montecarlo Clustering Search Algorithm (MCSA) to solve optimization problems. MCSA consists of two phases: Exploration and Exploitation, as seen in Figure 1. Exploration is based on uniform random sampling by Montecarlo techniques, and in Exploitation, we perform an unsupervised clustering strategy to analyze density, allowing for a centroid-based clustering afterwards via K-means [11]. The clustering process, at the heart of MCSA's Exploitation phase, accomplishes two key objectives: it effectively reduces the search space and efficiently identifies feasible regions containing optimal values. Furthermore, MCSA conducts an iterative search process. Once it identifies a K-cluster (or multiple K-clusters) containing promising solutions, it initiates a subsequent exploration phase. This phase revolves around these viable regions, where new random samples are generated and leveraged to enhance the solution, bringing it closer to the optimum.
In order to test the scope of the methodology, we set out to solve a combinatorial optimization problem. In this regard, the Knapsack Problem is part of the historical list of NP-Complete problems elaborated by Richard Karp [12], and it also appeared in [13]. The binary Knapsack Problem (KP01) has been treated with a multitude of methods, some more efficient than others in terms of quality of the solution (such as greedy algorithms). As we have reviewed, a problem is classified as inherently difficult if its solution requires onerous computational resources, regardless of the algorithm used. It has also been addressed through heuristic methods [14], sometimes trading solution quality for a reduction of the search time, or sometimes seeking a reduction of the search space [15].
Therefore, we present a novel approach to tackle KP01. Our strategy involves organizing the input dataset within a discretized search space, structured in up to n dimensions. We set the problem size as being equal to the power-set of the n elements. The core aspect of our approach lies in the distribution of the elements, along with their combinations, across the n dimensions. Consequently, as the number of dimensions increases, the number of elements to be arranged within each dimension decreases. Notably, our approach has demonstrated solution qualities exceeding 90% on average when addressing instances containing up to 2000 elements, achieving near-optimal solutions.
Since we are aware that quality evaluation is important, MCSA was evaluated through the benchmark functions for optimization [16], [17], [18], achieving promising results: optimal in 60% of the test cases, and near optimal in the remaining 40%, given the complexity of the chosen functions.
The paper is organized as follows: Section II describes the Related Works, introducing meta-heuristic algorithms. Section III explains the Background of the MCSA Methodology and its evolution into a more robust algorithm, along with the evaluation tests that we carried out with benchmark functions. Section IV presents how we adapted the method to solve the Knapsack Problem and mapped its input parameters to the parameters of MCSA. Also in this Section, we provide a proposal to solve the multi-objective variant of the Knapsack Problem with our MCSA method. We discuss the results in detail in Section V, and finally present the conclusions in Section VI.

II. RELATED WORKS
As optimization problems become more complicated, optimization algorithms play an essential role, usually attempting to characterize the type of search strategy through an improvement on simple local search algorithms. However, in cases where the search space is enormous, an exhaustive search, iterative methods, or simple heuristics are impractical. In contrast, meta-heuristic ideas [19], sometimes classified as global search algorithms, can often find reasonable solutions with less computational effort.
Probabilistic distribution methods, such as Montecarlo, offer flexible forms of approximation with some advantages regarding cost. There are useful statistical techniques available, but the Montecarlo methods use randomness to solve problems, which is useful when it becomes difficult to apply other approaches. Algorithms based on random searches are often globally convergent. Thus, we find probabilistic methods such as the stochastic method [20], Andradottir's random search methods [21], the stochastic comparison method [22], and nested partitions with specific implementations as in [23], among other algorithms. Any globally convergent algorithm requires examining each feasible solution frequently to ensure convergence, because part of an optimization algorithm uses the information gathered so far to decide which of the candidate solutions should be tested next.
In this sense, other approaches use similarity or meta-heuristic algorithms to solve high-dimensional optimization problems, which are validated using large-scale functions [24]. However, they are prone to falling into local optimum values. To solve global optimization problems, the most recognized approaches that make use of global exploration are meta-heuristic algorithms. For example, many researchers have developed evolutionary-type algorithms [25], since they provide a harmonious balance between Exploration and Exploitation [26], [27], [28].
Evolutionary algorithms are inspired by natural biological evolution, featuring reproduction, mutation, recombination, and selection. One example is the Genetic Algorithm [13], which has been used with the most essential combinatorial and mutational operators and is considered a successful algorithm, accounting for its many variations. In the last decade, many optimization algorithms have been proposed, generally consisting of swarm-based, physics-based, genetics-based approaches, etc. [29]. Evolution Strategy [30] and Evolutionary Programming [31] are other common naturally inspired approaches to global optimization problems. Further details regarding the vast state of the art on optimization algorithms are reviewed in [32] and [33].
While multiple approaches continue to develop, we found that, among the combinatorial optimization methods, as Cabrera et al. [34] argue, the efficiency gains from applying sampling and grouping techniques should be explored to solve problems of a complex nature, which sometimes involve dynamic and strongly human-dependent systems.
Cluster analysis algorithms are a key element of exploratory data analysis. In a broad sense, clustering algorithms can be categorized into partition-based, hierarchy-based, density-based, and grid-based methods. Notably, the most widely used clustering algorithms are K-means [35] and DBSCAN [36], [37], [38], the latter being a popular clustering algorithm rooted in density-based principles. It identifies clusters by detecting high-density areas separated by low-density areas. DBSCAN is suitable for finding clusters of any shape in a spatial database, connecting adjacent regions of corresponding density. Its robustness extends to handling outlier data, making it particularly valuable for spatial data clustering [39].

III. MCSA HEURISTIC METHODOLOGY
In this section, we present MCSA, which is based on random sampling by Montecarlo techniques and a set of clustering strategies [44].
The method was initially developed to solve a particular type of optimization problem that featured a double-objective scheme [40], and whose solution was ultimately within the limits of the Pareto optimal. Solving a multi-objective optimization problem amounts to approximating a representative set of Pareto optimal solutions. Cabrera et al.'s approach was an interesting starting point because two-objective optimization brings out the essential features of multi-objective optimization; it pursued the best scenario to manage a Medical Emergency Department, and the methodology was validated through an exhaustive (brute-force) method, which helped to prove the quality of the results when using a combined heuristic.
We have devised an enhanced methodology that builds upon the Montecarlo sampling approach. This methodology incorporates a novel clustering strategy that synergizes two techniques to conduct a more precise search, consequently reducing the search space. The resulting regions are carefully analyzed as sample concentrations, offering valuable insights that can lead us to the optimal solution or a highly promising one.
In Figure 2, we show the MCSA flowchart. The base method consists of two phases. The first phase, Exploration, includes a coarse-grained approach, which carries out a global exploration of the discrete, ordered n-dimensional search space to find promising regions identified on the neighborhood structure of the problem. This phase uses Montecarlo-based heuristics, returning a collection of fit samples, i.e., a set of independent and homogeneously distributed variables that can be represented in the form of a map or a landscape. At this stage, one of our main objectives is to obtain a uniform sample, as homogeneous as possible. Thus, we ensure that the search process will be even.
Regarding the convergence criterion, we employ a fitness function that relies on calculating the mean of each Montecarlo sample set with a fixed precision. These samples ultimately form the solution map, which comprises all the fit samples that meet specific criteria or restrictions.
At this juncture, the Exploration phase of the methodology concludes, marking the gradual transition into the Exploitation phase. To facilitate this transition, we introduce a crucial step involving map reduction. This entails performing a cut procedure along the objective-function axis, thereby allowing us to seamlessly transition into the subsequent clustering analysis, which constitutes the primary focus of the Exploitation phase.
The iterative behavior of MCSA allows it to re-explore feasible regions, improving the solution and contributing to progress while approaching the global minimum, hence increasing the quality of the results. With high accuracy, the user can expect to obtain the value closest to the optimum. In addition, it may contribute to saving both time and human resources, considering the enormous amount of data to be calculated and the resources that are sometimes needed to reach a solution [40].

FIGURE 2. Flowchart of the MCSA heuristic methodology [44].

A. EXPLORATION PHASE
Montecarlo techniques initiate by generating random samples that uniformly cover the discretized search space, creating a representative sample set for the problem to be evaluated within the objective function f(x). The fitness function converges with a fixed precision of 1E-02 based on the average of previous values. This convergence criterion necessitates at least two iterations of the Montecarlo process for comparison. Once an initial sample is obtained, it can be visualized as a logical map, particularly evident in plots when dealing with low-dimensional problems. For instance, in an optimization problem with just one decision variable, the resulting fitness landscape typically takes the form of a nonlinear line. Similarly, for problems with two decision variables, the fitness landscape becomes a two-dimensional surface. In contrast, for higher-dimensional problems featuring three or more decision variables, the fitness landscape becomes a hyperplane, which poses challenges for easy visualization. We want to emphasize that we have imposed a strict upper limit on the percentage of random samples generated, fixed at 15% of the problem size. Our analysis confirms that this limit was never surpassed in any scenario.
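The Exploration loop described above can be sketched in a few lines. The following is a minimal, illustrative implementation, not the authors' code: the function name, batch size, and bound handling are our own assumptions. It draws uniform batches, tracks the running mean of the fitness values, and stops when the mean stabilizes within the fixed precision or when the 15% sample cap is reached.

```python
import numpy as np

def montecarlo_explore(f, bounds, problem_size, batch=500, tol=1e-2,
                       max_frac=0.15, rng=None):
    """Hypothetical sketch of the Exploration phase: uniform random sampling
    with a mean-based convergence check at a fixed precision (tol = 1E-02)
    and a hard cap of max_frac (15%) of the problem size."""
    rng = rng or np.random.default_rng()
    lo, hi = np.asarray(bounds, dtype=float).T   # bounds: [(lo, hi), ...]
    max_samples = int(max_frac * problem_size)
    samples, prev_mean = [], None
    while len(samples) < max_samples:
        xs = rng.uniform(lo, hi, size=(batch, len(lo)))  # one uniform batch
        samples.extend((x, f(x)) for x in xs)
        mean = np.mean([v for _, v in samples])
        # At least two Montecarlo iterations are required for the comparison.
        if prev_mean is not None and abs(mean - prev_mean) < tol:
            break
        prev_mean = mean
    return samples
```

Calling it on, say, a 2-D sphere function returns the (point, value) pairs that form the fit-sample map from which the cutting and clustering stages proceed.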
In the quest for the globally optimal solution, success depends on factors such as the size of the area to be searched, the smoothness of the landscape, and the behavior of the search method. In the context of optimization problems, we define the objective function as f(x), representing a real-valued function linked to one or more decision variables, aiming for maximization or minimization. It is worth noting these aspects because the search strategy we propose is capable of analyzing wider regions and crossing valleys in the fitness landscape to achieve better results. The methodology is also scalable to problems of larger dimensions.
After the initial execution, MCSA enters a series of iterations aimed at refining the solution. During these iterations, it returns to the Exploration phase, with a focus on feasible regions delineated by clusters containing the most favorable values. The purpose of these iterations is to achieve increasingly accurate results. However, the algorithm limits the number of iterations, and the stopping criterion is determined by variations in solution values. If these variations become negligible, the algorithm considers the current solution the best possible.

B. EXPLOITATION PHASE
In this phase, we delve into an investigation of the algorithm's efficiency. The transition into this phase is gradual and occurs after the cutting process has been executed. As previously mentioned, one of the challenges encountered during the adaptation of the density clustering method to the MCSA methodology was related to the expanding problem size. To successfully complete our two-step Exploitation process using clustering methods, we cannot initiate the search for densities directly on the initial Montecarlo map. Directly applying DBSCAN, for instance, would result in a single cluster, necessitating an analysis of the entire sample.
Hence, it is advantageous to reduce the sample map, a task achieved by cutting the sample along the objective function f (x) Axis.
To provide clearer insight into the cutting process, Figure 3 presents a flowchart decomposing the steps needed to implement this cut on the sample map, reducing it and enhancing the analysis and clustering of the data. It also details the procedure for sweeping through the sample map and eliminating irrelevant samples.
Initially, we determine the lower and upper bounds based on the values of f(x) obtained from the resulting samples (Distance and Range in the flowchart). We then divide this range into ten fixed sections (MaxSteps = 10), each representing a heuristic 'Step' based on f(x). We commence the sweep from the lower bound of each section until one of two stopping criteria is met: either the heuristic 'Step' is successfully completed, or we identify multiple regions with a high concentration of samples.
In the first scenario, after completing a heuristic step (the initial cut proposal), we calculate the average of the values within the resulting sample concentration to assess the potential for grouping concentrations into clusters. These calculations are updated at each step (Step++).
If, during one step, more than one cluster is identified (CurrNumClusters >= PrevNumClusters), and in subsequent steps (Step <= 10) the number of clusters either remains unchanged or decreases (PrevNumClusters = CurrNumClusters), the alternative stopping criterion is triggered. This leads to a return to the previous step, where the algorithm divides the data into two separate regions that can be treated as distinct clusters. This approach enables a more in-depth exploration of regions with potential values.
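To make the sweep concrete, the following sketch (our own illustrative code, not the authors' implementation) divides the f(x) range into MaxSteps fixed sections and returns the cut from the previous step once the density-cluster count stops growing; the density counter is injected as a callback so that any method (e.g. DBSCAN) can be plugged in.

```python
import numpy as np

def cut_sample_map(points, fvals, count_clusters, max_steps=10):
    """Hypothetical sketch of the cutting sweep (MaxSteps = 10). Sweeps from
    the lower bound of f(x) upward; once the number of density clusters
    stalls (CurrNumClusters <= PrevNumClusters), it returns the mask of the
    previous step, i.e., the proposed cut."""
    fvals = np.asarray(fvals, dtype=float)
    lo, hi = fvals.min(), fvals.max()
    step_size = (hi - lo) / max_steps          # the Range split into sections
    prev_n, prev_mask = 0, None
    for step in range(1, max_steps + 1):       # Step++
        mask = fvals <= lo + step * step_size  # samples kept below the cut line
        n = count_clusters(points[mask])
        if prev_mask is not None and 0 < n <= prev_n:
            return prev_mask                   # alternative stopping criterion
        prev_n, prev_mask = n, mask
    return prev_mask                           # heuristic step completed
```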
We say we are entering the Exploitation phase once the resulting areas (with concentrations of samples) are delimited. As depicted in Figure 4, systematically sweeping the data along the f(x) axis, starting from its lower bound (the minimum sample value), allows for the exclusion of samples that do not need to be analyzed.
The same Figure provides information about the number of potential clusters discovered at each step of the heuristic process. The Y-axis corresponds to the range of function values, while the X-axis represents the function's domain. Furthermore, the arrow in the Figure signifies the direction of this sweeping operation, involving a partial analysis of the sample. Lastly, the dashed line denotes the section where the proposed cut will be executed.
After this step, the Montecarlo map is clustered to pinpoint regions predominantly characterized by sample concentrations, adhering to the DBSCAN-configured parameters, such as the minimum number of samples per cluster and the inter-sample distance. In this specific case, two clusters will be identified and formed.
The second clustering process utilizes K-means. This particular sequencing enables us to discern feasible regions with greater precision. Additionally, it addresses the limitations we encountered with distance criteria, upon which many centroid-based clustering methods, including K-means, rely. These criteria did not facilitate an accurate analysis of the Montecarlo map for our specific objectives. Consequently, we have devised this dual clustering strategy that incorporates a density analysis. In this scenario, two distinct regions were identified and subsequently selected as the final clusters. At this point, the remaining portion of the sample is discarded, and we proceed to initiate the K-means clustering process. The clusters are formed around four centroids to classify the values based on their distance to the centroid.
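As an illustration of this dual strategy, the sketch below uses scikit-learn's DBSCAN and KMeans as stand-ins for the configured implementations; the parameter values are assumptions for demonstration, not the paper's settings. The density pass drops low-density samples, and K-means then partitions the surviving points around k centroids.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

def dual_clustering(points, eps=0.5, min_samples=5, k=4):
    """Hypothetical sketch of the two-step Exploitation clustering:
    density analysis (DBSCAN) followed by centroid-based grouping (K-means).
    eps, min_samples and k are illustrative values."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    dense = points[labels != -1]               # discard noise / sparse samples
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(dense)
    return dense, km.labels_, km.cluster_centers_
```

On a toy map with two concentrated regions plus an outlier, the density pass removes the outlier and K-means recovers one centroid per region.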
This allows for the selection of the cluster (or clusters) containing the most promising values for further analysis. The initial selection often provides a solution, which in some cases can be deemed feasible. In other instances, it may be necessary to iterate a specific number of times to refine the solution. Consequently, conducting further exploration within the bounded regions and exploiting the new values becomes crucial for enhancing the solution. In cases where multiple K-clusters contain numerous identical values that can be considered feasible solutions, one might infer that the problem resembles a function with multiple global minima.
Thus far, the proposed MCSA methodology enables us to operate within reduced regions of the discrete search space. This is achieved through the combined clustering techniques that constitute a successful Exploitation phase. Moreover, the iterative nature of MCSA helps prevent being trapped in local minima, thereby offering the opportunity to discover better values compared to the ones found initially.

C. QUALITY EVALUATION OF MCSA HEURISTIC METHOD
To evaluate the MCSA heuristic thoroughly, we selected a set of different benchmarks, further seeking more complicated multi-dimensional functions to increase the difficulty of the optimization and stress the algorithm. A balance between the number of samples needed to reach a solution and the problem size is necessary to perform the evaluation.
The experiments were conducted using classical optimization benchmark functions whose number of objectives is known. Among the most common limitations of these functions is that some of them are defined with respect to only one or two parameters; others are non-separable, nonlinear, etc. The benchmarks, as listed in Table 1, are treated as minimization problems in our case. The quality of the solutions is overall above 97% with respect to the optimal, which is reported in the Objective Function Value f(x) column. As evident from our observations, the ratio between the number of samples required to attain a solution and the problem size is exceptionally low. In the majority of instances we were able to achieve high-quality solutions, and in some cases, even the optimal solution. Nevertheless, out of all the cases we tested, only one reached the limit of 15% of samples with respect to the problem size. This specific case pertains to the XinSheYang function, where the solution quality reached a remarkable 100%.
In addition to these tests, we have evaluated MCSA using the CEC17 benchmark problems for single-objective, real-parameter numerical optimization [10], which has helped us understand how to compare the performance of different meta-heuristics, as appears in [41], [42], and [43]. Table 2 reports the tested benchmark problems, based on shifted, rotated, non-separable, highly ill-conditioned, and complex optimization benchmark functions, and shows the best results obtained from an average of 200 executions. We tested most of the functions in two and ten dimensions and over an ample domain.
As seen in the boxplots in Figure 6 and Figure 7, the results obtained by MCSA were relevant overall. While in some functions the optimum was found, in other functions promising results were achieved, indicating that the result values are more or less symmetrical. The difficulty of the functions is coupled with the size of the search space, which scales exponentially as the number of dimensions increases. Therefore, as the problem size grows exponentially, obtaining a homogeneously distributed random sample becomes incredibly difficult, causing the fitness function values to converge too soon, giving rise to inefficiencies and low opportunities of finding promising solutions.
Based on these results, we believe there is potential for enhancement.As a result, we are eager to pursue further testing with the aim of exploring opportunities to reduce the search space.This will enhance the algorithm's exploration processes, particularly when dealing with substantial problem sizes.

IV. SOLVING THE KNAPSACK PROBLEM THROUGH MCSA HEURISTIC METHOD
The Knapsack Problem (KP) is considered a combinatorial optimization problem because it fulfills a series of properties that also make it particularly suitable to study. Binary KP, or KP01, is an important modality of the classical Knapsack Problem and one of the most intensively analyzed discrete programming problems.
The reason for such interest derives from the fact that it represents many practical situations: for example, in cutting stock, where a steel plate must be cut into different pieces; in determining the items that a warehouse can store to maximize its total value; or in maximizing the benefit in investment allocation when there is only one restriction. It also appears in load distribution problems (physical, electrical, etc.) and in the allocation of processors and data in distributed systems.
The Knapsack Problem also turns out to be an interesting choice to evaluate our MCSA methodology because of its intuitive, easy-to-understand statement and its importance among real-life decision-making processes in various fields. In addition, adapting this type of problem to our methodology adds another step: the preliminary work of mapping the parameters of the problem onto the parameters of MCSA required developing a strategy that included the decomposition of the problem into a mathematical formulation, comprising variables and parameters.
Our aim in solving the Knapsack Problem is to find a quality solution while making efficient use of the resources, especially compute, which is directly related to the Exploration processes involved in sampling methods like MCSA, designed to re-explore feasible regions and improve the solution by performing iterations. As we know, the solution is contained within some combination of elements, for which we must know the power-set, based on the number of elements. Thus, any combination within this set is a possible solution. To deal with this problem, we distribute the elements and their power-set along the search space, which can be n-dimensional, and perform an ordering process. This type of arrangement takes into consideration all the possible combinations of elements.
The Knapsack Problem has the following statement: given a set of items, determine which items to include in the Knapsack so that the total weight is less than or equal to a given capacity. Each item has a weight and provides a benefit (or profit), depending on the problem itself. The objective is for a supposed person to choose the elements that maximize the benefit without exceeding the allowed capacity of the Knapsack.
Mathematically, we have:

W(x) = \sum_{i=1}^{n} w_i x_i \leq W, \qquad x_i \in \{0, 1\},

where W denotes the maximum capacity of the Knapsack and the x_i are the binary decision variables, whose index i can vary from 1 to n. Concerning w_i and p_i, they represent the weight and profit of element number i, meaning that the sum of the weights, i.e., the function W(x) of the objects that we put in the Knapsack, must not exceed its capacity W. Regarding the objective, it is given by:

Z(x) = \sum_{i=1}^{n} p_i x_i,

where Z(x) is the objective function (to maximize or minimize). A vector x that meets the restriction W is feasible, and if it yields the maximum value of Z(x), then x is optimal; other constraints can be added depending on the case, giving rise to singular variants. One of the essential steps in our methodology is to analyze how to adapt any problem's data to the parameters of MCSA: finding the limiting rules, and deciding how to sort the data to be represented through a Montecarlo sample map. If we parameterize a problem well, the results will be better interpreted, enabling us to find the relationships between the variables and to reuse what was previously learned.
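As a minimal illustration of this formulation, the helper below (a hypothetical name of our own) evaluates a candidate binary vector x, returning the profit Z(x) and whether the weight constraint W(x) <= W holds:

```python
def knapsack_eval(x, weights, profits, W):
    """Evaluate a KP01 candidate: Z(x) = sum(p_i * x_i),
    feasible iff W(x) = sum(w_i * x_i) <= W."""
    total_w = sum(w * xi for w, xi in zip(weights, x))  # W(x)
    total_p = sum(p * xi for p, xi in zip(profits, x))  # Z(x)
    return total_p, total_w <= W
```

For instance, with weights (2, 3, 4), profits (3, 4, 5) and W = 5, the vector x = (1, 1, 0) yields Z(x) = 7 and is feasible, since W(x) = 5.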
Figure 8 shows a graphic representation of the MCSA methodology applied to the Knapsack Problem. The Exploration phase comprises the sampling process, which is carried out throughout the search space. The ranges of such space are bounded by the calculation of the power-set; for 6 elements, this is 2^6 = 64. Suppose we arrange a 2-dimensional search space. How do we distribute the elements?
To answer this question, we should keep in mind the cardinality of a set. If we distribute evenly, then we will put 3 elements per dimension, that is, |S| = 3. Therefore, its power-set is 2^3, which is equal to 8 possible combinations. As seen in the same Figure 8, for this instance the problem size remains equal to 64, due to the correspondence between the sets.
To enable identification of any element or subset, a binary ID is given to every combination of elements in each dimension, and its values 0 and 1 indicate whether an element enters the Knapsack or not. Thus, it is possible to know exactly which elements make up each combination, making it easier to interpret the solution. Once every binary ID has been assigned, we sort the elements according to the objective function Z(x) and use the Montecarlo Index to place them on the axes. This step is important, since it provides continuity to the sampling process. Every sample will be surrounded by similarly valued samples, ensuring that MCSA does not need to analyze the entire map.
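The per-dimension encoding can be sketched as follows (an illustrative helper of our own, not the authors' code): every subset of a dimension's elements receives a binary ID along with its total weight and profit, and the subsets are then sorted in ascending profit order before being placed on the axis.

```python
from itertools import product

def dimension_combinations(weights, profits):
    """Enumerate every combination of one dimension's elements with its
    binary ID, total weight and total profit, sorted by ascending profit.
    Names and tuple layout are illustrative."""
    combos = []
    for bits in product((0, 1), repeat=len(weights)):
        w = sum(wi for wi, b in zip(weights, bits) if b)   # total weight
        p = sum(pi for pi, b in zip(profits, bits) if b)   # total profit
        combos.append(("".join(map(str, bits)), w, p))
    return sorted(combos, key=lambda c: c[2])              # ascending Z(x)
```

For a dimension holding two elements with weights (2, 3) and profits (5, 1), the four combinations come out ordered 00, 01, 10, 11 by profit (0, 1, 5, 6).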
For larger instances, when the number of elements increases, we may have to deal with a large input involving many elements and variables. In this case, we use an n-dimensional distribution approach. As shown in Table 3, the input data can be distributed through the n dimensions, for which we can think of n arrays that will contain subsets of elements, while the problem size remains the same. There, we show a multi-dimensional arrangement of the elements from a specific Knapsack Problem instance.
After completing the ordering and sorting of elements along the axes, we initiate the Exploration phase, as illustrated in Figure 9. This phase involves generating random samples, with the aim of achieving a uniform distribution throughout the search space. Each set corresponds to a sample, where the value is determined by the sum of the weights of elements selected from the respective subsets. It is crucial to emphasize that for a sample to be considered, the total weight sum must adhere to the problem's weight restriction (i.e., not exceed the Knapsack's capacity). Any sample failing this condition will be discarded. The Exploration phase transitions smoothly into Exploitation, employing a reduction strategy through the previously described ''cutting process'' along the f(x) axis. This process eliminates redundant samples and facilitates density analysis, resulting in at least one density cluster, which is then grouped using K-means to identify feasible regions.
To illustrate the methodology, we applied it to solve the EX01 instance from Kreher and Stinson (available online). Before entering the Exploration phase, we experimented with different dimensional distributions. For instance, distributing 12 elements per dimension results in 2^12 = 4096 possible combinations for each dimension. However, when using a four-dimensional distribution with 6 elements per dimension, there are only 64 combinations per dimension. This approach significantly reduces the number of elements to be sorted in each dimension. Extending this to a 24-dimensional distribution, where each dimension contains only one element, can lead to a notable reduction in axis sorting time. This reduction can be advantageous in terms of resource utilization, especially when dealing with a large number of samples to be computed.
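The trade-off among these distributions reduces to simple power-set arithmetic. A small helper (hypothetical, assuming an even split of elements across dimensions) makes the numbers above explicit:

```python
def per_dimension_combinations(n_elements, n_dims):
    """Combinations to sort per axis when n_elements are split evenly
    across n_dims dimensions; the overall problem size (2**n_elements)
    is unchanged by the choice of n_dims."""
    assert n_elements % n_dims == 0, "even split assumed for illustration"
    return 2 ** (n_elements // n_dims)
```

For the 24-element EX01 instance: 2 dimensions give 4096 combinations per axis, 4 dimensions give 64, and 8 dimensions give 8, while the problem size stays at 2**24.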
To tackle the ''EX01'' instance of the Knapsack problem in an 8-dimensional space, where we place 3 elements in each dimension, we end up with 8 combinations for each dimension. The overall problem size remains unchanged at 1.676E+07. Before transitioning into the Exploration phase, a necessary step is to arrange the elements and their combinations along the axes. This setup enables MCSA to generate random samples and construct the Montecarlo map. Keep in mind the weight restriction, denoted as W: samples failing to meet this restriction are discarded. Consequently, the map consists only of valid, weight-compliant samples. Table 4 shows how MCSA identifies and evaluates each combination; the binary ID plays a crucial role in computing the total weight and profit. In the same table, we depict all the element combinations in dimension one of the EX01 instance. The remaining elements are distributed across the seven other dimensions, each undergoing the same evaluation process. Subsequently, the combination that maximizes Z(x) while adhering to the weight restriction W is chosen.
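The per-dimension evaluation can be sketched as follows; the bit ordering of the binary ID and the data layout are assumptions of this example, not taken from the paper:

```python
def enumerate_combinations(elements):
    """All 2**k subset combinations of the k (weight, profit) elements
    placed on one axis. The binary ID encodes membership: bit j set
    means element j is included in the combination."""
    k = len(elements)
    combos = []
    for binary_id in range(2 ** k):
        weight = sum(elements[j][0] for j in range(k) if binary_id >> j & 1)
        profit = sum(elements[j][1] for j in range(k) if binary_id >> j & 1)
        combos.append((binary_id, weight, profit))
    return combos
```

With 3 elements per axis this yields the 8 combinations per dimension mentioned above, including the empty set (binary ID 0).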
Figure 10 provides a visual representation of the process of ordering all subsets by ascending profit and sorting them by the Montecarlo index. This arrangement is crucial for placing them along the axes. The figure illustrates the final configuration of each dimension.
The sampling process, which marks the start of the Exploration phase, leads to the construction of the initial sample map. Subsequently, we focus on reducing the sample map through the ''cutting process'' along f(x). In our experience, the Knapsack Problem exhibits characteristics akin to a continuous function. As the process unfolds, the algorithm gradually transitions into the Exploitation phase. Figure 11 visually represents this shift, where clustering procedures are deployed to identify feasible regions.
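The cut-and-cluster step can be illustrated with a small sketch; the threshold choice, the 1-D clustering over f(x) values, and the fixed iteration count are simplifying assumptions for this example:

```python
def cut(samples, threshold):
    """''Cutting process'' along f(x): keep only samples whose objective
    value reaches the cut-line; the rest are discarded."""
    return [s for s in samples if s[1] >= threshold]

def kmeans_1d(values, k, iters=20):
    """Tiny 1-D K-means grouping the surviving f(x) values into
    candidate feasible regions (requires k >= 2)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # move each centroid to the mean of its cluster (if non-empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters
```

In practice, the densest cluster(s) delimit the regions over which re-sampling is performed in later iterations.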
Iterations help improve the solution by performing re-sampling over the feasible regions. In Figure 12, we show how MCSA obtains the final solution. A single sample is made up of a selection of different combinations in each dimension. In the table, the marked squares represent the chosen combinations: C3 from Dim1, C5 from Dim2, C4 from Dim3, and so on until C2 from Dim8. Notice that dimension 7 returned C0, meaning that the empty set was selected.
The numerical results are presented in Table 5. In general, we have achieved an efficient solution from a computational perspective: the ratio of the number of samples required to obtain a solution to the problem size is approximately 1%. When compared to a published solution obtained using a branch and bound algorithm, our solution demonstrates high quality, as discussed in Section V.

A. THE MULTI-OBJECTIVE KNAPSACK
To broaden the applicability of our methodology to other NP-type combinatorial optimization problems, we consider the Multi-objective Knapsack Problem, an extension of the original Knapsack Problem. In this variant, our objective is to simultaneously optimize two distinct objective functions: first, to maximize the aggregate weight of the elements selected for inclusion in the Knapsack, and second, to maximize the profit generated by these chosen elements, all while adhering to the weight restriction.
While the fundamental problem statement of the Knapsack remains consistent, our approach introduces a notable departure in terms of problem-solving strategy. In this context, we treat the weight restriction as both a constraint and an objective function, thereby transforming the problem into a bi-objective Knapsack Problem. Of course, the incorporation of supplementary constraints further amplifies the inherent complexity of the Knapsack Problem within this framework.
The field of multi-objective optimization is concerned with optimization problems involving multiple objectives which, in general, conflict in the sense that no feasible solution is optimal for all of them simultaneously. Therefore, compromise solutions have to be found. A compromise solution is one that cannot be improved with respect to one objective without worsening the value of the other objective function.
To tackle the Multi-objective Knapsack Problem, we must adapt our methodology to accommodate its specific parameters. As demonstrated earlier, by calculating the power-set, much like in the binary Knapsack case, we can determine the problem's size and subsequently define the boundaries of the n-dimensional search space. The unique approach here involves arranging the elements, each with its corresponding weights and values, within the search space. However, the arrangement is based on the weighted sum of the objectives.
The weighted sum offers the possibility of weighing and combining inputs in order to create an integrated analysis and account for multiple factors by incorporating an importance factor, which indicates each element's relative importance. In addition, it admits integer and floating-point values, allowing us to solve instances of the Knapsack problem in which objects can be split, which is another challenging variant.
The binary Multi-objective Knapsack problem with a single constraint has the following mathematical formulation:

$$\max \; z_i(x) = \sum_{j=1}^{n} c_{ij} x_j, \quad i = 1, \ldots, m,$$

$$\text{subject to} \quad \sum_{j=1}^{n} w_j x_j \le W, \qquad x_j \in \{0, 1\},$$

where $z_i(x)$ represents the $i$-th objective function; $n$ is the number of elements; $m$ is the number of objective functions; $c_{ij}$ represents the profit of item $j$ on criterion $i$; $w_j$ is the weight/cost of item $j$; and $W$ represents the overall Knapsack capacity. A decision variable $x_j = 1$ if and only if element $j$ is included in the Knapsack; otherwise $x_j = 0$. In this problem, we assume that $W$, $c_{ij}$, and $w_j$ are positive integers, with $w_j \le W$ and $\sum_{j=1}^{n} w_j > W$, for all $j \in [1, 2, \ldots, n]$ and $i \in [1, 2, \ldots, m]$.
The distribution of elements in the search space follows the same idea of using an n-dimensional array. Afterward, ordering is carried out according to the weighted sum of the elements. The user decides whether to maximize one objective or the other, or to set fixed importance factors for each objective in an attempt to optimize both objectives at the same time.
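The weighted-sum ordering can be sketched as follows; the normalization by the per-objective maxima and the parameter names `alpha` and `beta` are assumptions of this example:

```python
def weighted_sum_order(elements, alpha, beta):
    """Order (weight, profit) elements by the weighted sum of the two
    normalized objectives; alpha and beta are the user-chosen
    importance factors."""
    max_w = max(w for w, _ in elements)
    max_p = max(p for _, p in elements)

    def score(element):
        w, p = element
        # both objectives scaled to [0, 1] before combining
        return alpha * (p / max_p) + beta * (w / max_w)

    return sorted(elements, key=score, reverse=True)
```

Setting `alpha = 1, beta = 0` recovers a pure profit ordering, while intermediate factors trade the two objectives off against each other.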
To solve this instance, we aimed at a compromise solution, or trade-off, in which two objectives were considered and given different importance factors. Each objective term is normalized to the same scale, as in (4). In multi-objective optimization, the concept of the Pareto front is commonly used, since it allows making trade-offs within a set rather than considering the full range of every parameter. This way, we suppose that one of the objectives is to be minimized: the smaller the value of objective1 in the formula, the smaller the difference and the smaller the trade-off. Ordering with this method guarantees continuity.
MCSA attained a solution that complies with both objectives while fulfilling the restriction; the outcome will, in part, depend on the user's requirements, since the importance of the objectives is defined by the user. The following section discusses the results obtained by MCSA when we solved three instances of the Knapsack Problem, as well as a variant of the Knapsack problem in which we optimize two objectives at the same time, reaching a trade-off, for which we applied the weighted sum. The detailed results are the best of 200 executions, considering different importance factors for each of the objectives.

V. DISCUSSION OF THE RESULTS
The evaluation of the MCSA heuristic method through the benchmark functions allowed us to gain a better knowledge of the problems we were going to deal with, making us capable of recognizing the patterns of the feasible regions. This was especially the case with the Knapsack Problem, since the surface on which the problem's parameters unfold turned out to be similar to the functions with which we had validated. The classical Knapsack Problem was fully adapted to our parameters through a heuristic approach. This approach involves distributing the elements across dimensions, calculating the power-set to obtain all combinations, and sorting them within the n-dimensional space. The ordering of elements in this n-dimensional search space ensures continuity, which is crucial for the sampling process. This initial step is essential for enabling MCSA to effectively solve the Knapsack Problem.
In Table 7, we present the details of the different Knapsack Problem instances that we solved. The Optimal Profit column shows the known optimal results for EX01 and EX02, which are open-source problems available online. For all the synthetic EX instances, the optimal values were calculated following the method described in [45]. As these are randomly generated Knapsack instances with a vast number of elements, we set an optimal profit to evaluate the performance of the MCSA heuristic.
Overall, we obtained high-quality results for all instances. Table 8 showcases the optimal outcomes in the headers, contrasted with the results obtained through the MCSA heuristic method in the inner cells. Notably, our method demonstrated an impressively small ratio between the required number of samples and the problem size. Furthermore, when examining the axis sort time, it becomes evident that fewer elements per dimension result in reduced time consumption. This aspect can significantly impact the total execution time and, consequently, resource utilization, as shorter processing times typically indicate more efficient management of computational resources.
In summary, the quality of the solutions out of 30 full executions for each dimensional approach was: above 98% for EX01, an average of 86% for EX02, and above 96% for EX03. As for EX04, EX05, and EX06, the quality reached 92.77%, 92.23%, and 94.17%, respectively. Additionally, it is worth emphasizing that MCSA demonstrates its efficacy in tackling high-dimensional challenges, despite the inherent complexity associated with such problems. This is apparent not only from the results achieved in benchmark function tests but also from its performance when addressing combinatorial optimization problems such as the Knapsack problem within a discretized sample space.
These results confer a distinct advantage when compared to other search algorithms that achieve optimal solutions through brute force, albeit at a substantial computational cost in terms of execution time.
The results obtained for the EX01 instance in 8 dimensions represent one of the best outcomes. We achieved a maximum profit of 13,436,707, which translates to a solution quality of 99.17% compared to the optimum. Additionally, this solution satisfies the weight constraint W with a total weight of 6,399,256, comfortably below the limit. From a computational perspective, it demonstrates high efficiency, as the ratio of required samples to problem size does not even reach 1%.
In the context of the synthetic EX instances, we would like to underscore two key considerations that underpin the rationale behind proposing these problems. First, we sought to enhance problem complexity by augmenting the number of elements, leading to a substantial growth in problem size. Given that the solution space encompasses the power-set of these elements, the distribution and organization of these elements across multiple dimensions become inherently challenging, thereby warranting the application of heuristic methods. Second, beyond the challenges posed by size, we also aimed to establish optimal solutions for these instances, facilitating meaningful comparisons. By defining optimal solutions, we enabled a robust assessment of the quality of our heuristic approach. As our results indicate, our heuristic approach has indeed demonstrated a commendable level of quality.
However, despite the success demonstrated in addressing synthetic instances, the MCSA methodology encounters typical limitations when tackling high-dimensional problems. Experimentation revealed that, as the number of dimensions increases, the complexity of the search space grows exponentially, giving rise to the well-known ''curse of dimensionality.'' In n-dimensional problems where the range is conserved, the sheer quantity of potential solutions escalates substantially, posing challenges to the efficacy of any heuristic algorithm. The exploration of the search space becomes more scattered and less focused, potentially resulting in a loss of precision and efficiency.

VI. CONCLUSION AND FUTURE WORK
This paper presents the MCSA heuristic, which is used to solve combinatorial optimization problems. It comprises two phases: an Exploration phase, where the sampling processes are carried out in an n-dimensional search space; and an Exploitation phase, where we seek to locate the feasible regions in which the best values lie. The solution can be improved through iterations on bounded regions, re-exploring the space until a better solution is achieved.
The adaptation work needed to bring a problem's parameters into those of the MCSA heuristic method is a preliminary task that enables MCSA to solve optimization problems by distributing the elements in an n-dimensional search space. Furthermore, ordering and sorting of the elements is another essential step in the methodology, since it ensures the continuity needed for the sampling process.
Furthermore, we conducted an evaluation of MCSA using benchmark optimization functions. Encouraged by the promising outcomes, we applied our approach to tackle a practical optimization challenge, selecting the Knapsack Problem due to its significance in various domains, particularly in engineering and computing. Our results demonstrated high quality, with an average solution performance of 90% across all instances, as compared to exhaustive methods.
We extended our approach to address the multi-objective variant of the Knapsack Problem, aiming to optimize multiple objectives simultaneously. This evolution in the MCSA heuristic methodology highlights its adaptability to diverse problem types that share similarities with the Knapsack, such as resource allocation, production scheduling, and distribution processes. By acknowledging the limitations of existing approaches and identifying areas for improvement, we can assess the overall quality and enable meaningful comparisons. The scalability of the MCSA algorithm in high-dimensional scenarios is now a critical consideration for enhancing performance and maintaining relevance in the challenge of finding high-quality solutions while keeping the ratio of samples used for this purpose low.
A significant step forward involves the parallelization of MCSA, allowing us to address another class of combinatorial optimization problems. This advancement gains particular significance in complex domains such as healthcare, and in decision-making in emergency healthcare settings. The potential of parallelization holds promise for enhancing the efficiency and applicability of MCSA, offering more effective solutions for urgent decision-making scenarios in the healthcare domain. As part of our ongoing efforts to improve the algorithm's performance, we are actively focusing on parallelization during the Exploration phase. This strategic emphasis aims to ensure the generation of a homogeneous sample distributed effectively across the search space, thereby enhancing efficiency and optimizing resource utilization in terms of compute.

REMO SUPPI is currently an Associate Professor with the School of Engineering, Universitat Autònoma de Barcelona (UAB), and a Researcher with the High-Performance Computing for Efficient Applications and Simulation Research Group (HPC4EAS). His research interests include HPC, high-performance simulation of ABM models, and big data processing technologies. He is the coauthor of more than 60 scientific papers on the above topics in international journals and conferences and has been a supervisor of seven Ph.D. theses.

DOLORES REXACHS is currently an Associate Professor with the Department of Computer Architecture and Operating Systems, Universitat Autònoma de Barcelona (UAB), Spain. She has been a supervisor of ten Ph.D. theses and an invited lecturer at universities in Argentina, Brazil, Chile, and Paraguay. She has coauthored more than 70 fully reviewed technical papers in journals and conference proceedings. Her research interests include parallel computer architecture, parallel I/O subsystems, fault tolerance in parallel computers, and tools to evaluate, predict, and improve performance in parallel computers.

EMILIO LUQUE is currently an Emeritus Professor with the Department of Computer Architecture and Operating Systems, Universitat Autònoma de Barcelona (UAB), Spain. He is an invited lecturer at universities in the USA, South America, Europe, and Asia; a keynote speaker at several conferences; and a leader in several research projects funded by the European Union (EU), the Spanish government, and different industries. He has supervised 20 Ph.D. theses and coauthored more than 230 technical papers in journals and conference proceedings. His major research interests include parallel and distributed simulation, performance prediction and efficient management of multicluster-multicore systems, and fault tolerance in parallel computers.

FIGURE 3. Flowchart illustrating the cutting process for sample reduction.

Figure 5 offers an illustrative example demonstrating how the implementation of the cutting process results in the removal of redundant samples. Consequently, it presents the advantage of obviating the need for an exhaustive analysis of every single sample. In this scenario, two distinct regions were identified and subsequently selected as the final clusters. At this point, the remaining portion of the sample is discarded, and we proceed to initiate the K-means clustering process. The clusters are formed around four centroids to classify the values based on their distance to the centroid. This allows for the selection of the cluster (or clusters) containing the most promising values for further analysis. The initial selection often provides a solution, which in some cases can be deemed feasible. In other instances, it may be necessary to iterate a specific number of times to refine the solution. Consequently, conducting further exploration within the bounded regions and exploiting the new values becomes crucial for enhancing the solution. In cases where multiple K-clusters contain numerous identical values that can be considered feasible solutions, one might infer that the problem resembles a function with multiple global minima. Thus far, the proposed MCSA methodology enables us to operate within reduced regions of the discrete search space. This is achieved through the combined clustering techniques that constitute a successful Exploitation phase. Moreover, the iterative nature of MCSA helps prevent getting trapped in local minima, thereby offering the opportunity to discover better values compared to the ones found initially.

FIGURE 4. Cutline proposal over a Montecarlo map to reduce the search space into dense regions in a random benchmark function.

FIGURE 5. Clustered regions that are subsequently consolidated using K-means clustering, over a selected benchmark.

FIGURE 6. Boxplot showing the results of CEC2017 benchmark tests.

FIGURE 7. Boxplot showing different set of results of CEC2017 benchmark tests.

FIGURE 8. Graphic representation of MCSA heuristic methodology to solve the Knapsack Problem: The ordering of the elements in the search space.

FIGURE 9. Sampling Process: Each sample undergoes evaluation in the function f(x) and must satisfy the restriction W.

FIGURE 10. Ordering the elements and sorting by the Montecarlo Index.

FIGURE 11. Entering the Exploitation phase: a) Cut-line to enhance density analysis; b) K-means clustering and iterations.

FIGURE 12. A single sample is made of the combination of elements in 8 dimensions.
MARIA HARITA is currently pursuing the Ph.D. degree with Universitat Autònoma de Barcelona, Spain. She was a collaborating teacher in subjects such as distributed systems. Her research interests include algorithm design, heuristic methodologies, and optimization-related techniques. Her research is supported by the Spanish government.

ALVARO WONG is currently an Associate Researcher with the Department of Computer Architecture and Operating Systems, Universitat Autònoma de Barcelona (UAB), Spain. He has worked on performance prediction of HPC applications in the ITEA 2 European Project, research centers, and industries. He has coauthored a total of 20 fully reviewed technical papers in journals and conference proceedings.

TABLE 1. Evaluation results using the classical benchmarks for optimization.

TABLE 2. Results of the tests with the CEC2017 benchmark functions.

TABLE 3. n-dimensional distribution of elements in the search space for the Knapsack EX01 instance with 24 elements [44].

TABLE 4. Solving EX01: combinations of the elements of dimension 1 (out of 8).

TABLE 6. Results obtained for the EX01 instance adapted to a multi-objective type problem, showing different importance factors.

TABLE 7. Detail of instances EX01 and EX02, and synthetic EX03 to EX06.

TABLE 8. Results obtained from the different Knapsack problem instances solved by MCSA.