Big Data-Driven Cognitive Computing System for Optimization of Social Media Analytics

The integration of big data analytics and cognitive computing yields a new model that can exploit the most sophisticated advances in industry and its decision-making processes, and can resolve failures encountered during big data analytics. In the E-projects portfolio selection (EPPS) problem, big data-driven decision-making is of great importance in web development environments. The EPPS problem deals with choosing a set of the best investment projects on social media such that maximum return is achieved with minimum risk. To optimize the EPPS problem on social media, this study develops a hybrid fuzzy multi-objective optimization algorithm, named NSGA-III-MOIWO, which combines the non-dominated sorting genetic algorithm III (NSGA-III) and multi-objective invasive weed optimization (MOIWO) algorithms. The objectives are to simultaneously minimize variance, skewness and kurtosis as the risk measures and to maximize the total expected return. To evaluate the performance of the proposed hybrid algorithm, data from 125 active E-projects in an Iranian web development company over the period 2014-2018 are analyzed. The experimental results provide the optimal policy under the main limitations of the system and demonstrate that NSGA-III-MOIWO outperforms NSGA-III and MOIWO in finding efficient investment boundaries in EPPS problems. Finally, a statistical comparative analysis is performed to test the performance of NSGA-III-MOIWO against some well-known multi-objective algorithms.

Web development projects have recently received a lot of attention from investors in different countries. In this field, E-portfolio is a new concept that aims to find the best portfolio for social media investors. E-portfolio was first introduced by Chantanarungpak [1] and refers to collecting and storing a portfolio in various formats using computer technology. It is proposed by connecting the concepts of financial portfolio and web development. The E-projects portfolio selection (EPPS) problem is a new and applicable optimization problem that seeks to find the best social media-based projects with the highest return and lowest investment risk, and it is inspired by the modern portfolio selection problem presented by Markowitz [2]. One of the major problems in developing countries, including Iran, is the lack of a suitable investment platform for individuals and organizations. One of the key factors for web development companies is the active participation of people in E-projects. The most important issue regarding investing in an E-project-based company is the selection of the most appropriate investment bonds and the formation of an optimal E-projects portfolio.
Here, one may face large amounts of data on projects that are unused and futile, and this can make project portfolio selection a difficult task when conventional approaches are employed. Efficiently processing a very large amount of data can help reveal critical E-project components, which provides opportunities for making better investment decisions. Hence, the key role of big data analytic tools becomes clear: they yield more accurate results while helping to avoid miscalculations and human errors. On the other hand, big data analysis by humans is a time-consuming task, and therefore efficient cognitive systems can be employed to process this large amount of data [3], [4].
The features of big data analytics and cognitive computing can be employed to better perceive the issues of privacy, trust and information security [39]. Healthcare and medicine systems have been the first application of cognitive systems using the advantages of big data analytics.
In the EPPS problem, big data-driven decision-making is of great importance in web development environments. As an effective tool, the cognitive computing-based system works by intercepting the command and then drawing inferences and proposing possible solutions. Furthermore, big data provided from social media can be managed effectively using a big data analytics process. Accordingly, customer behavior can be recognized and the five characteristics of big data, known as volume, value, velocity, variety and veracity, can be handled. These features provide the required input information for EPPS optimization. The aforementioned discussion reveals that there is a need for a general and multi-purpose approach to optimize the EPPS problem. Hence, the main objectives and contributions of this paper are explicitly stated as follows:
• A mathematical model is proposed to address the EPPS based on social media and big data-driven computing. The mathematical model includes minimizing risk in terms of variance, skewness and kurtosis measures, as well as maximizing expected returns.
• A hybrid algorithm named NSGA-III-MOIWO is proposed. It takes the advantages of non-dominated sorting genetic algorithm III (NSGA-III) and multi-objective invasive weed optimization (MOIWO) algorithms at the same time to deal with the complexity of the problems.
• A fuzzy mechanism is incorporated with NSGA-III-MOIWO to handle the uncertainty inherent in the data.
• Extensive simulations are conducted to evaluate the performance of the proposed algorithm. To verify the performance of the proposed algorithm, it was implemented in a range of problems in an Iranian web development company. The data derived from 125 active projects over the period 2014-2018 were employed.
• The proposed NSGA-III-MOIWO is compared against other well-known multi-objective algorithms using the analysis of variance (ANOVA) statistical test.
Finally, the aim of this study is to address the following questions: I. How can an EPPS problem be formulated? II. How can the uncertain nature of the problem be modeled using fuzzy theory? III. How can the big data-driven cognitive computing system be implemented? IV. How can the proposed NSGA-III-MOIWO be designed and validated?
The remaining sections are organized as follows. Section II reviews the related works. In Section III, the proposed problem of the study is discussed and formulated. Our proposed hybrid algorithm is presented in Section IV and the numerical experiments are provided in Section V. Moreover, Section VI provides a discussion of the results. Finally, Section VII presents the concluding remarks and future research directions.

II. RELATED WORK
This section first highlights the challenges and existing solutions for the industrial portfolio selection and project portfolio problem. Then, the main efforts of recent research on the big data-driven and cognitive computing systems for social media optimization are reviewed.
Kolm et al. [5] extensively reviewed the 60-year history of portfolio optimization and examined various models presented in this area. In their study, they investigated various types of models for optimizing stock portfolios under certain, uncertain and different risk-type conditions. The literature reveals that industrial portfolio selection problems are complex in nature; hence, they have drawn the attention of researchers working on metaheuristic algorithms. From this perspective, Ehrgott et al. [6] presented a multi-objective model influenced by the original Markowitz model [2]. Five functions were used to represent risk and expected return and were considered as objective functions for the metaheuristic algorithms. They utilized three popular meta-heuristic algorithms, namely a genetic algorithm (GA) [32], simulated annealing (SA) [33] and tabu search (TS) [34], for solving their model. Oh et al. [7] implemented a GA for the stock portfolio optimization problem by considering index fund management. The index fund is one of the most common strategies in portfolio management. They demonstrated that the GA has a significant advantage over the conventional portfolio mechanism and provides an average performance for the flat market. Macedo et al. [8] implemented two very popular multi-objective evolutionary algorithms, namely NSGA-II [9] and the strength Pareto evolutionary algorithm 2 (SPEA 2) [10]. They also used and compared technical analysis indicators to obtain better outcomes in terms of risk-return trade-offs. Recently, Babazadeh and Esfahanipour [11] presented a novel multi-period portfolio optimization model based on the mean value at risk (VaR) with consideration of operational and transaction constraints. To solve the proposed problem, they developed an enhanced NSGA-II algorithm and investigated its performance against three other multi-objective algorithms using benchmark problems.
Given the need to recognize sources of uncertainty in real-world problems, many researchers have increasingly paid attention to the portfolio optimization problem under uncertainty. De Neufville et al. [43] proposed a practical framework to evaluate the design alternatives of flexible engineering projects under deep uncertainty. Similarly, Cardin et al. [44] proposed a systematic four-step methodology based on engineering options analysis to enhance the lifecycle performance of engineering systems design by incorporating engineering options so as to proactively deal with deep market uncertainty. They evaluated different engineering design alternatives with different economic performance criteria such as VaR, value-at-gain (VaG) and expected net present value (ENPV). Deng et al. [12] applied a new maximin model to select portfolios under uncertainty arising from both randomness and estimation of inputs. Besides, Huang's research works [13], [14] on portfolio optimization using fuzzy logic can be considered important studies in this area. Tavana et al. [15] developed a comprehensive methodology consisting of data envelopment analysis (DEA), the technique for order preference by similarity to ideal solution (TOPSIS) and integer programming to solve a fuzzy portfolio selection problem. Among other studies that employed fuzzy logic, Perez et al. [16] considered applying fuzzy constraints, Saborido et al. [17] and Liagkouras and Metaxiotis [18] developed multi-objective optimization algorithms, Liu et al. [19] employed multi-criteria decision-making (MCDM) methods and Liu [20] introduced a new fuzzy model. Similarly, some studies used other approaches to uncertainty, including stochastic programming [21] and robust optimization [22].
As one of the rare research works, Chiu et al. [35] examined the big data challenges of high-dimensional continuous-time mean-variance portfolio selection problems. The aim was to estimate the total error accumulated from the huge dimension of stock data.
Recently, researchers have been working on the application of big data analytics and artificial intelligence in real-world problems [36]-[38]. Real-time decision-making use cases are known as the main advantage of big data-driven cognitive computing systems [39], [40]. Companies are increasingly utilizing real-time data analytics to make more intelligent and faster decisions and remain ahead of competitors. For example, perceiving customer preferences and real-time price changes helps industries transform their traditional businesses into modern data-driven ones. The congestion of data, the support of real-time decisions and the application of complicated computational models are different aspects of a problematic context. Therefore, data availability and provision, real-time analytics and dynamic algorithms for real-time processing are the main future research directions.
As one of the most comprehensive reviews, Gupta et al. [41] investigated the role of big data with cognitive computing using a conceptual model for the future. A survey was conducted on social media big data analytics by Ghani et al. [42]. They investigated recent advances in machine learning algorithms too.
After reviewing and scrutinizing related research works, the identified research gaps are divided into two major parts: 1) The lack of an efficient meta-heuristic algorithm to optimize risk and expected return simultaneously in the EPPS problem. Moreover, optimizing risk using a single measure cannot encompass all possible risks in E-projects. To the best of our knowledge, the risk criteria of kurtosis, skewness and variance have not been studied at the same time in the literature. Considering them concurrently brings the study closer to real-world conditions. Therefore, the focus of this study is on the application of a hybrid NSGA-III-MOIWO algorithm, developed based on NSGA-III and MOIWO, as one of the most recent and most efficient multi-objective evolutionary algorithms, to solve the EPPS considering expected return as well as the risk criteria of variance, skewness and kurtosis simultaneously.
2) The importance of this research is due to its comprehensive approach to finding an efficient solution in EPPS problems. Moreover, the main value-added of this research in the field of the big data-driven cognitive computing system is to introduce and use some new optimization techniques which provide a more efficient solution as compared to the existing approaches.

III. PROBLEM STATEMENT
As stated in the previous section, Markowitz first proposed a model for asset portfolio selection using the mean and variance in 1952. He formulated the problem as a quadratic programming model with the goal of minimizing the variance of a set of assets, provided that the expected return is equal to a constant value. The classic Markowitz model had several drawbacks, which were first discussed by Seyedhosseini et al. [23].
VOLUME 8, 2020
Here, we develop a modified fuzzy model based on Markowitz's model in which a risk aversion coefficient is used. It can be presented by (1)-(6) [24], where $x_i$ and $x_j$ are the proportions of the total capital budget invested in E-projects $i$ and $j$, respectively. Moreover, $\sigma_{ij}$ is the risk of selecting E-projects $i$ and $j$ simultaneously, and $\mu_i$ represents the expected return of the $i$th project. $K$ is the portfolio size, i.e., the number of selected E-projects, and $\lambda$ is a parameter that takes a value between 0 and 1. For instance, if $\lambda = 0$, the entire weight is assigned to the return and the risk is ignored, so the portfolio with the highest return is chosen, whereas if $\lambda = 1$, the entire weight is assigned to the risk regardless of the return, so the portfolio with the minimum risk is selected. Equation (1) represents the objective function for the minimization of risk. When $\lambda$ takes a value between zero and one, portfolios are optimized by considering both risk and return. As $\lambda$ increases, the objective of risk minimization becomes more important; accordingly, the coefficient $(1 - \lambda)$ decreases and the objective of return maximization becomes less important. Equation (2) shows that the sum of investments over all E-projects equals the total budget and links all decision variables. Equation (3) indicates the maximum number of E-projects to be selected, where $z_i$ is a binary variable that takes the value 1 when the $i$th project is in the E-projects portfolio. Equation (4) shows that $\varepsilon_i$ and $\delta_i$ are the lower and upper bounds of the $i$th variable, i.e., the $i$th project in the portfolio.
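Equations (1)-(6) are not reproduced in this text, so the following Python sketch only illustrates the behavior described above. It assumes that (1) has the usual weighted-sum form, $\lambda \cdot \text{risk} - (1 - \lambda) \cdot \text{return}$, which is consistent with the roles of $\lambda = 0$ and $\lambda = 1$, and that (2)-(4) are the budget, cardinality and bound constraints. The function names and the toy numbers are hypothetical.

```python
def weighted_objective(x, mu, sigma, lam):
    """Assumed form of (1): lam * risk - (1 - lam) * return, to be minimized."""
    n = len(x)
    risk = sum(x[i] * x[j] * sigma[i][j] for i in range(n) for j in range(n))
    ret = sum(x[i] * mu[i] for i in range(n))
    return lam * risk - (1.0 - lam) * ret


def is_feasible(x, k, eps, delta, tol=1e-9):
    """Checks in the spirit of (2)-(4): budget, portfolio size and bounds."""
    selected = [xi for xi in x if xi > 0]
    budget_ok = abs(sum(x) - 1.0) <= tol          # proportions sum to one
    size_ok = len(selected) <= k                  # at most K projects
    bounds_ok = all(eps - tol <= xi <= delta + tol for xi in selected)
    return budget_ok and size_ok and bounds_ok


# Toy data (hypothetical): two E-projects with uncorrelated risks.
mu = [0.10, 0.20]
sigma = [[0.04, 0.00], [0.00, 0.09]]
x = [0.5, 0.5]
for lam in (0.0, 0.5, 1.0):
    print(lam, weighted_objective(x, mu, sigma, lam))
```

With $\lambda = 0$ the value reduces to the negated return, and with $\lambda = 1$ it reduces to the portfolio risk alone, which mirrors the two extreme cases discussed in the text.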

A. FUZZY PORTFOLIO OPTIMIZATION MODEL
In real cases of cognitive analysis, the variety and uncertainty of input data is an important issue for which analysts should find a suitable approach. In the fuzzy approach, it is possible to define the uncertain and approximate parameters of the objective function and constraints. Thus, a fuzzy approach can be very useful when there is a lack of knowledge, experience or information that can be definitively defined in the cognitive analysis.
In order to formulate the portfolio mathematical model with an uncertain return, each uncertain parameter is considered as a triangular fuzzy number. The distribution of the triangular fuzzy number is represented in Fig. 1. Moreover, the membership function of a triangular fuzzy number is presented in (7).
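Equation (7) is not reproduced in this text. As a minimal sketch, and assuming it is the usual piecewise-linear membership function of a triangular fuzzy number $(a, b, c)$, the function can be written in Python as follows. The name `triangular_membership` is hypothetical.

```python
def triangular_membership(x, a, b, c):
    """Membership degree of x in the triangular fuzzy number (a, b, c)."""
    if not (a <= b <= c):
        raise ValueError("expected a <= b <= c")
    if x == b:
        return 1.0                      # the peak has full membership
    if a < x < b:
        return (x - a) / (b - a)        # rising edge
    if b < x < c:
        return (c - x) / (c - b)        # falling edge
    return 0.0                          # outside the support


print(triangular_membership(0.5, 0.0, 1.0, 2.0))
```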
Now, consider $\xi_i$ as the fuzzy number for the return of each project, and $x_i$ as the investment ratio required for project $i$. Essentially, the return $\xi_i$ of each project is calculated using (8), where $p_i$, $p'_i$ and $d_i$ are the value of project $i$ at the present time, the estimated price during the intended period and the derivation of estimated price, respectively.
Since $p'_i$ and $d_i$ are uncertain at the present time, they are regarded as fuzzy variables, so that $\xi_i$ is a triangular fuzzy parameter. Under this assumption, the return of a portfolio of $n$ projects with weight vector $(x_1, x_2, x_3, \ldots, x_n)$, namely $\xi = \sum_{i=1}^{n} x_i \xi_i$, is also a fuzzy variable. In order to formulate the mean and deviation indicators of the portfolio, the credibility of a fuzzy number (Cr) is applied as the mean of its possibility and necessity. A fuzzy parameter might fail even if its occurrence possibility is equal to one, and it might occur even if its necessity is equal to zero. That is why the credibility criterion uses the combination of these two functions and, in fact, plays the role of occurrence possibility in fuzzy conditions. According to Liu and Liu [24], the mean, variance, skewness and kurtosis of a fuzzy parameter are calculated based on (9)-(12), respectively, where $r$ is a random variable in the range between the lower and upper bounds of the predefined fuzzy number. Now, in addition to the variance criterion, we can use skewness (Sk) and kurtosis (Ku). To provide efficient solutions that fully cover the risk of the EPPS problem, a quad-objective model is represented through (13)-(16), subject to constraints (2)-(6).
Equation (13) lists risk minimization in the form of a variance. Equations (14) and (15) indicate risk minimization in the form of skewness and kurtosis criteria. Equation (16) maximizes the total EPPS problem returns.
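The credibility measure and the fuzzy mean used above can be illustrated with a short Python sketch. It relies on two standard results of credibility theory for a triangular fuzzy number $(a, b, c)$ with $a < b < c$: the closed form of $\mathrm{Cr}\{\xi \le r\}$ and the expected value $(a + 2b + c)/4$. It also uses the fact that a non-negative weighted sum of triangular fuzzy numbers is again triangular. The function names and the example numbers are hypothetical, and the sketch is not meant to reproduce (9)-(16) exactly.

```python
def credibility_leq(r, a, b, c):
    """Cr{xi <= r} for a triangular fuzzy number (a, b, c) with a < b < c.

    Credibility is the average of possibility and necessity.
    """
    if r <= a:
        return 0.0
    if r <= b:
        return (r - a) / (2.0 * (b - a))
    if r <= c:
        return (r + c - 2.0 * b) / (2.0 * (c - b))
    return 1.0


def expected_value(a, b, c):
    """Credibilistic expected value of a triangular fuzzy number."""
    return (a + 2.0 * b + c) / 4.0


def portfolio_return(weights, returns):
    """Fuzzy portfolio return sum_i x_i * xi_i for non-negative weights."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return tuple(
        sum(w * r[k] for w, r in zip(weights, returns)) for k in range(3)
    )


# Hypothetical example: two projects with triangular fuzzy returns.
xi = portfolio_return([0.25, 0.75], [(0.0, 1.0, 2.0), (4.0, 4.0, 8.0)])
print(xi, expected_value(*xi))
```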

B. DEFUZZIFICATION OF THE PROPOSED MODEL
To solve the proposed model, it needs to be defuzzified first. To do this, the concepts introduced in the previous section are used to convert fuzzy parameters to crisp parameters.
According to Anavangot et al. [3], if a triangular fuzzy number is represented as $(d_{1i}, d_{2i}, d_{3i})$, the variance of this fuzzy number is calculated using (17), where $\alpha_i$ and $\gamma_i$ are the maximum and minimum deviations of the fuzzy number, calculated through (18)-(19). Moreover, according to Hao and Liu [25], the skewness and kurtosis of this fuzzy number are calculated through (20)-(21), respectively.
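Equations (17)-(19) are not reproduced in this text. The sketch below assumes that they correspond to the standard credibilistic variance of a triangular fuzzy number $(d_1, d_2, d_3)$, namely $(33\alpha^3 + 21\alpha^2\gamma + 11\alpha\gamma^2 - \gamma^3)/(384\alpha)$ with $\alpha = \max(d_2 - d_1, d_3 - d_2)$ and $\gamma = \min(d_2 - d_1, d_3 - d_2)$. The skewness and kurtosis formulas (20)-(21) are not sketched here. The function names are hypothetical.

```python
def deviations(d1, d2, d3):
    """Maximum (alpha) and minimum (gamma) deviation from the peak d2."""
    left, right = d2 - d1, d3 - d2
    return max(left, right), min(left, right)


def triangular_variance(d1, d2, d3):
    """Credibilistic variance of the triangular fuzzy number (d1, d2, d3)."""
    alpha, gamma = deviations(d1, d2, d3)
    if alpha == 0:
        return 0.0  # a crisp number has no spread
    numerator = (33 * alpha**3 + 21 * alpha**2 * gamma
                 + 11 * alpha * gamma**2 - gamma**3)
    return numerator / (384 * alpha)


# For a symmetric number with half-width d the formula reduces to d**2 / 6.
print(triangular_variance(0.0, 1.0, 2.0))
```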
Finally, considering all the assumptions about optimizing the EPPS problem, the proposed model that seeks to find an efficient boundary for investment with fuzzy information is presented as follows. In this model, fuzzy notations for all related parameters are shown.

IV. PROPOSED NSGA-III-MOIWO
The proposed NSGA-III-MOIWO algorithm is developed by hybridizing the NSGA-III and MOIWO algorithms. MOIWO is a numerical optimization algorithm inspired by weed growth in nature; its single-objective version, invasive weed optimization (IWO), was first introduced by Mehrabian and Lucas [26]. Weeds are very robust and adaptable to environmental changes, and by simulating their properties and behavior, the authors of [26] developed a meta-heuristic optimization algorithm. Some of the specific features of IWO compared to other evolutionary algorithms are its mechanisms of reproduction, spatial dispersal, and competitive exclusion [26]. The algorithm is simple yet converges efficiently to optimal solutions. As IWO has strong operators for finding neighborhood solutions, it has been selected as a component of the hybrid algorithm in this research. Its main procedure consists of initialization followed by these three mechanisms.
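The three IWO mechanisms named above can be illustrated by a minimal single-objective sketch in Python. It follows the commonly described scheme in which the number of seeds grows linearly with fitness and the dispersal standard deviation shrinks nonlinearly over the iterations. All parameter names and default values are assumptions made for illustration, not the settings used in this study.

```python
import random


def iwo_minimize(f, dim, bounds, iters=50, pop_init=10, pop_max=20,
                 seeds_min=1, seeds_max=5, sigma_init=1.0, sigma_final=0.01,
                 n_mod=3, rng=None):
    """Minimal invasive weed optimization sketch for minimizing f."""
    rng = rng or random.Random(0)
    lo, hi = bounds
    # Initialization: scatter the first weeds uniformly in the search space.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_init)]
    history = []
    for it in range(iters):
        fit = [f(w) for w in pop]
        best, worst = min(fit), max(fit)
        # Spatial dispersal: the standard deviation decays from sigma_init
        # toward sigma_final as the iterations proceed.
        sigma = (((iters - it) / iters) ** n_mod
                 * (sigma_init - sigma_final) + sigma_final)
        offspring = []
        for w, fw in zip(pop, fit):
            # Reproduction: better weeds produce more seeds (linear scaling).
            ratio = 1.0 if worst == best else (worst - fw) / (worst - best)
            n_seeds = seeds_min + int(round(ratio * (seeds_max - seeds_min)))
            for _ in range(n_seeds):
                seed = [min(hi, max(lo, wi + rng.gauss(0.0, sigma)))
                        for wi in w]
                offspring.append(seed)
        # Competitive exclusion: keep only the pop_max fittest plants.
        pop = sorted(pop + offspring, key=f)[:pop_max]
        history.append(f(pop[0]))
    return pop[0], history


def sphere(w):
    return sum(v * v for v in w)


best, history = iwo_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, history[-1])
```

Because parents compete with their seeds during exclusion, the best objective value never deteriorates from one iteration to the next.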

A. STEPS OF THE PROPOSED NSGA-III-MOIWO
To provide a new hybrid algorithm based on NSGA-III and MOIWO, the ideas presented in both algorithms are delicately combined. The rationale behind proposing such a hybrid algorithm is to overcome the drawbacks of MOIWO.
In the proposed hybrid algorithm, a crossover operator of the NSGA-III is employed for crossover and reproduction. The steps of the proposed NSGA-III-MOIWO algorithm are as follows:
Step 1. Generate a random population and evaluate their objective function.
Step 2. Reproduce based on the GA.
Sub-step 2.1. Use the roulette wheel method to choose two solutions randomly.
Sub-step 2.2. Apply the one-point crossover method to produce two new solutions.
Step 3. Conduct competitive elimination based on the weed algorithm mechanism.
Step 4. Identify the non-dominated solutions and introduce them in Pareto fronts.
Step 5. Check the termination condition, if it is met then go to Step 7, otherwise go to Step 6.
Step 6. Implement Niche preservation operator to specify the next-generated solutions and go to Step 2.
Step 7. Report the best Pareto front. The pseudo-code of the proposed algorithm is presented in Fig. 2.
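Sub-steps 2.1 and 2.2 and Step 4 can be sketched in Python as below. This is an illustration only and not the pseudo-code of Fig. 2: the selection weights passed to the roulette wheel are assumed to be larger for better solutions, all objectives are assumed to be minimized (a maximized return would be negated first), and the function names are hypothetical.

```python
import random


def roulette_pick(weights, rng):
    """Roulette wheel: pick an index with probability proportional to weight."""
    total = sum(weights)
    r = rng.uniform(0.0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off


def one_point_crossover(p1, p2, rng):
    """Swap the tails of two parents after a random cut point."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def dominates(u, v):
    """u dominates v if it is no worse in all objectives and better in one."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def non_dominated(objs):
    """Indices of the objective vectors that no other vector dominates."""
    return [i for i, u in enumerate(objs)
            if not any(dominates(v, u) for j, v in enumerate(objs) if j != i)]


rng = random.Random(1)
child_a, child_b = one_point_crossover([0, 0, 0, 0], [1, 1, 1, 1], rng)
print(child_a, child_b)
print(non_dominated([(1, 5), (2, 2), (3, 3), (5, 1)]))
```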

B. SOLUTION REPRESENTATION, ENCODING AND DECODING PROCEDURE
To represent a solution of the EPPS problem, an encoding procedure with floating-point values between 0 and 1 is used. The length of the solution representation is 2N, divided into two segments. The values in the first segment of the solution representation determine which E-projects are selected for the project portfolio. In the decoding procedure, elements with a value greater than 0.5 will be in the E-projects portfolio. To determine the proportion of investment for each project, the second segment of the solution representation is used. Each number in this segment shows the percentage of investment. For example, Fig. 3 represents a solution for N = 5.
In the solution represented in Fig. 3, projects 3 and 5 have been selected; 25% of the capital is invested in project 3 and the rest is invested in project 5, which is a feasible solution.
In these circumstances, however, the generated solution may violate the budget constraints of the model. To convert the infeasible solution into the feasible one, a repairing mechanism is implemented.
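The decoding procedure can be sketched in Python as follows. The repairing mechanism is not specified in the text, so the sketch assumes the simplest possible repair: the investment genes of the selected projects are rescaled so that they sum to one. The gene values in the example are hypothetical and were chosen only to reproduce the 25%/75% split described for Fig. 3; the portfolio size and bound constraints are not handled here.

```python
def decode(chromosome, threshold=0.5):
    """Decode a 2N-gene chromosome into N investment proportions.

    The first N genes select projects (gene > threshold); the last N genes
    give raw investment shares, rescaled over the selected projects
    (assumed repair rule) so that the budget constraint holds.
    """
    n = len(chromosome) // 2
    select_genes, weight_genes = chromosome[:n], chromosome[n:]
    selected = [i for i, g in enumerate(select_genes) if g > threshold]
    proportions = [0.0] * n
    if not selected:
        return proportions  # empty portfolio, left for the caller to handle
    total = sum(weight_genes[i] for i in selected)
    for i in selected:
        if total > 0:
            proportions[i] = weight_genes[i] / total
        else:
            proportions[i] = 1.0 / len(selected)  # fall back to equal shares
    return proportions


# N = 5: projects 3 and 5 are selected (genes 0.9 and 0.7 exceed 0.5).
genes = [0.1, 0.2, 0.9, 0.3, 0.7, 0.4, 0.5, 0.2, 0.9, 0.6]
print(decode(genes))
```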

V. NUMERICAL EXPERIMENTS
To verify the performance of the proposed NSGA-III-MOIWO, data on the returns of 125 active E-projects were collected from a web development company in Iran from 2014 to 2018. The applied data set is derived from the raw data collected during these five years. The raw data were the net profit of each of the 125 projects in each year. To use these data, it was necessary to transform them into fuzzy return parameters.
In this research, data processing is performed by converting raw data to the fuzzy return parameter by calculating average, maximum and minimum of the net profit over five years. Accordingly, lower bound, middle bound and upper bound of the fuzzy return parameters are calculated. For each project, the lower bound is equal to the minimum profit, the middle bound is equal to the average return and the upper bound is the maximum profit generated from the beginning of 2014 to the end of 2018 including 60 months.
Subsequently, dimension reduction is performed by removing E-projects with less than 5 years of data, and the feature extraction procedure is applied to calculate the net profit of each project, which is the basis of the fuzzy return parameter in our model.
The input parameters include the fuzzy return and risk of investment. Fuzzy return is obtained from projects' net profit as previously explained. The risk of investment is calculated based on Eqs. (13)-(16). Then, the optimization phase using the proposed hybrid meta-heuristic algorithm is applied. These processes are depicted in Figure 4.
To evaluate the efficiency of the proposed hybrid algorithm, its performance is compared with two high-performance multi-objective evolutionary algorithms, NSGA-III and MOIWO [28]-[30]. The algorithms were coded in MATLAB R2016 software and the key results are reported and analyzed.
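The pre-processing described above (dropping projects with incomplete data and building the triangular fuzzy return from the minimum, average and maximum net profit) can be sketched in Python as follows. The function name, the dictionary layout and the sample figures are hypothetical; the study's raw data are not reproduced here.

```python
def to_triangular(profits_by_project, required_years=5):
    """Map yearly net profits to triangular fuzzy returns (min, mean, max).

    Projects with fewer than required_years observations are removed.
    """
    fuzzy = {}
    for name, profits in profits_by_project.items():
        if len(profits) < required_years:
            continue  # dimension reduction: incomplete history
        fuzzy[name] = (min(profits), sum(profits) / len(profits), max(profits))
    return fuzzy


# Hypothetical sample: P2 lacks five years of data and is dropped.
sample = {"P1": [10, 20, 30, 40, 50], "P2": [5, 7]}
print(to_triangular(sample))
```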

A. INPUT PARAMETERS SETTINGS
To implement and evaluate the proposed meta-heuristic algorithm, it was coded in MATLAB software. At each iteration, the value of the objective function, efficiency, and the risk of the project portfolio along with the computational runtime are reported. The parameters of the problem were set according to the list below:
Risk-Averse Coefficient: As outlined in Section III, the risk-aversion coefficient is used to trace the efficient boundary, and its value lies between 0 and 1. To map the efficient boundary, the risk-aversion coefficient is varied with a step size of 0.1. With this step size, 10 points of the efficient boundary are obtained, which allows for an accurate comparison of the points.
Lower Bound ($\varepsilon_k$) and Upper Bound ($\delta_k$) for Each Decision Variable: If there is a constraint associated with an investment in an E-project, the minimum and maximum ratio of investment in that project can be considered in the problem. In this research, for all selected E-projects, the minimum and maximum investment ratios are set to 0.001 and 1, respectively.
Project Portfolio Size ($K$): This parameter specifies the number of E-projects to be selected for investment. In order to carefully examine the EPPS optimization, $K$ is set to 3, 5, 10, 20, and 50.

B. IMPLEMENTATION OF THE ALGORITHM
According to the descriptions above, the NSGA-III-MOIWO and NSGA-III algorithms were implemented for different project portfolio sizes and different risk aversion coefficients. For demonstration purposes, the related efficient boundaries were plotted. The results of each computational experiment are given below. In the first step, the size of the E-projects portfolio is equal to 10, and for different values of the risk aversion coefficient, the return and risk of investment as well as the value of the objective function are calculated.
The linear combination of risk and return is then calculated. These results are shown in Table 1. Table 1 indicates that as the risk aversion coefficient (λ) increases, the risk of the investment portfolio decreases. The reason for this behavior is that by increasing the risk aversion coefficient, the effect of variance increases and the effect of the return decreases. The same problem is also solved with the NSGA-III algorithm. It should be noted that, due to its general multi-objective structure, this algorithm does not need to combine risk and return into a single objective, so both objectives can be optimized simultaneously. This process is also performed with the MOIWO algorithm. To better understand the performance of the three algorithms, it is necessary to examine the linear risk-return combination for different risk aversion coefficients. Fig. 5 shows the objective function value obtained for each risk aversion coefficient.
As can be seen in Fig. 5, the values of the objective function for different risk aversion coefficients clearly show the difference between the solutions obtained from the MOIWO and NSGA-III algorithms. The results show that the objective function value of MOIWO is lower than that of NSGA-III for almost all risk aversion coefficients, except where the coefficient is 0.8 or 0.9. Furthermore, the NSGA-III-MOIWO algorithm is significantly superior to the other two algorithms given the risk aversion level. In other cases, it also has a good-enough advantage over the other algorithms. Therefore, it can be concluded that this hybrid algorithm outperforms the other two basic algorithms.
When K = 50, the same procedure was performed for the EPPS with a size of 50 E-projects. The relevant results are presented in Table 2.
As shown in Table 2, as risk aversion increases, the return of the investment portfolio decreases. This behavior indicates that when risk aversion increases, the focus of the problem shifts to minimizing the risk rather than maximizing returns, which leads to a lower objective function value. As in the previous case, the problem was also solved with the NSGA-III and MOIWO algorithms in this example. The obtained results are evaluated and illustrated schematically.
In Fig. 6, the value of the objective function is shown for a set of different risk aversion coefficients. As can be seen, for large portfolio sizes, the performance difference between the MOIWO and NSGA-III algorithms is very large, so that in all instances MOIWO yields a lower objective function value than NSGA-III. Moreover, this difference grows as risk aversion increases. This shows that as the dimensions of the problem increase, MOIWO becomes more effective than the other meta-heuristic methods. Regarding the efficiency of the NSGA-III-MOIWO algorithm, in all cases except for the 0.7 risk level, the output of NSGA-III-MOIWO is better than that of the other two algorithms. This superiority is reduced as the risk aversion factor increases.

C. EVALUATION OF THE ALGORITHMS CONVERGENCE
One of the quality measures of meta-heuristic algorithms is how fast they converge to desirable solutions. In this part of the numerical results, the convergence of the proposed hybrid algorithm is compared with that of the NSGA-III and MOIWO algorithms over the iterations. In this regard, the number of iterations for each algorithm is set to 100. The weighted sum of risk and return is calculated using a risk aversion coefficient of 0.5 for each of these algorithms. The results for K = 10 and K = 50 are presented in Figs. 7 and 8.
The results in Fig. 7 show that the NSGA-III-MOIWO algorithm rapidly converged to its minimum level at iteration 50, while the NSGA-III and MOIWO algorithms converged to their minimum values at iterations 60 and 65, respectively. Moreover, the value to which the NSGA-III-MOIWO algorithm converges is lower than that of the other two algorithms. This suggests that the proposed algorithm converges faster and provides a higher-quality set of solutions. As shown in Fig. 8, the NSGA-III-MOIWO algorithm converged at iteration 40, whereas the NSGA-III and MOIWO algorithms converged at iterations 58 and 71, respectively. Furthermore, the value to which the NSGA-III-MOIWO algorithm converges is lower than the other algorithms' values. The results of Figs. 7 and 8 indicate that as the size of the EPPS increases, the efficiency of the proposed algorithm improves in terms of quality and computational runtime as compared to the other two algorithms.

D. COMPARISON WITH WELL-KNOWN MULTI-OBJECTIVE ALGORITHMS
This subsection provides further investigation and validation of our proposed algorithm against well-known multi-objective meta-heuristic algorithms, i.e., NSGA-II, NSGA-III, MOIWO and multi-objective particle swarm optimization (MOPSO) [31], using five quality measures: f best, f worst − f best, number of fitness evaluations (NFE), GAP and run time. The obtained results are presented in Table 3. Here, f best and f worst denote the best and worst objective values found, respectively. Lower values of NFE show the superiority of an algorithm. The GAP index is calculated based on the best f best that has been obtained by NSGA-III-MOIWO. As can be seen in Table 3, NSGA-III-MOIWO outperforms the other algorithms. To further clarify and demonstrate the efficiency of the proposed algorithm, the results for each tested algorithm are shown as box plots in Figure 9.
As the reported results are in a multi-objective environment and some algorithms may find some suitable solutions in their Pareto optimal solutions, it is necessary to implement a comparison based on a statistical test between NSGA-III-MOIWO and the other tested algorithms. Therefore, in order to have a comprehensive comparison between the studied multi-objective algorithms, the ANOVA test is applied in SPSS software under a 95% confidence level. The obtained results are summarized in Table 4. According to Figure 9, the proposed hybrid algorithm has the lowest volatility among the metaheuristic algorithms. Moreover, the best objective values found by NSGA-III-MOIWO are lower than the others, which confirms the efficiency of this hybrid algorithm.
In Table 4, df stands for the degree of freedom and Sig. is a significant level. As shown in Table 4, Sig. values are greater than the risk level (0.05) and it is concluded that all studied algorithms have a significant difference in Pareto optimal solution with each other. In order to find the best algorithm, the total sum of squares is checked. The sum of squares in NSGA-III-MOIWO is about 7356.3 which is lower than corresponding values in other algorithms. Hence, it can be concluded that the best Pareto solutions were obtained by NSGA-III-MOIWO algorithm.
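To make the ANOVA quantities mentioned above (sum of squares and df) concrete, the following Python sketch computes a one-way ANOVA by hand. It is only an illustration under assumed inputs: the study used SPSS, and the grouping of observations, the function name and the sample data here are hypothetical.

```python
from typing import Dict, Sequence


def one_way_anova(groups: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Compute sums of squares, degrees of freedom and the F statistic.

    Each inner sequence holds the observations of one group
    (for example, the objective values returned by one algorithm).
    """
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("F statistic is undefined for these data")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f_stat": f_stat,
    }


# Hypothetical objective values from three algorithms.
result = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(result)
```

The Sig. value reported by statistical packages is the upper-tail probability of this F statistic under the F distribution with (df_between, df_within) degrees of freedom; a library routine such as `scipy.stats.f_oneway` returns both the statistic and that p-value.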

VI. DISCUSSION
In this section, the answers to the questions raised in the introduction section are discussed. In the first stage of the suggested methodology, the EPPS problem was formulated on the basis of the portfolio selection problem. To capture the uncertain nature of the parameters, fuzzy credibility theory, which builds on the concepts of possibility and necessity, was applied to the developed model. Moreover, the big data-driven cognitive system was designed through data dimension reduction, feature extraction/selection and data pre-processing. As the final part of the proposed methodology, NSGA-III-MOIWO was designed step by step in order to optimize the problem efficiently. Accordingly, the advantages of NSGA-III and MOIWO were combined and incorporated into a single hybrid algorithm. To validate the efficiency and performance of the algorithm, its results were first compared to those of its base algorithms, i.e., NSGA-III and MOIWO. To further validate the efficiency of the proposed algorithm, its results were then statistically compared to those of two other well-known algorithms, NSGA-II and MOPSO.
The obtained results indicated that the proposed NSGA-III-MOIWO can efficiently solve the problem and outperform the other algorithms. Therefore, the proposed methodology of the research can be considered as a superior choice to study the big data-driven cognitive computing system in optimizing project portfolio selection problems.

VII. CONCLUSION AND OUTLOOK
In this research, to solve the EPPS problem on social media platforms, a new mathematical model was formulated and solved using a novel multi-objective metaheuristic algorithm named NSGA-III-MOIWO. The proposed model aimed to minimize the risk of EPPS problems, measured by variance, skewness and kurtosis, while maximizing their returns. To deal with the large amounts of data in the problem under consideration, a big data-driven decision-making procedure in cognitive computing systems was considered. The relationship between the two objectives (i.e., risk and return) was examined given a risk aversion coefficient. To solve this problem, a hybrid multi-objective algorithm based on NSGA-III and MOIWO (NSGA-III-MOIWO) was developed and implemented in MATLAB. Then, the model was verified by solving different EPPS problems with various risk factors. Numerical results showed that the proposed hybrid algorithm performs better than its two base algorithms, NSGA-III and MOIWO. The proposed algorithm has the potential to solve EPPS problems in a reasonable computational time. Moreover, an ANOVA statistical test was performed to compare the efficiency of the proposed algorithm with that of NSGA-II, NSGA-III, MOIWO and MOPSO. The results showed that the proposed algorithm outperforms the rest of the algorithms in terms of best fitness found, fitness diversification, number of fitness evaluations, GAP and run time. Therefore, this algorithm can be considered as one of the most effective algorithms for optimizing EPPS problems.
The most important advantage of this research is that it can create the most efficient portfolio for E-projects based on social media using the least possible information. According to the proposed methodology, it is enough to design a cognitive system to determine the risk and return values. Then, the best portfolio can be obtained using the suggested hybrid algorithm within a reasonable amount of time. On the other hand, the main disadvantage is that the proposed methodology includes random operators, so the result of each run is slightly different. Consequently, the values of f best and f worst also vary slightly from run to run. In this regard, it is necessary to appropriately adjust the parameters of the algorithm to reduce these differences.
For future research, one can further investigate VaR and employ it as another well-known risk measure in the problem to obtain a more comprehensive evaluation of the risk of the E-projects portfolio. Moreover, applying other uncertainty techniques to the problem can be an interesting research topic, and the results can be compared with those of our proposed fuzzy model as well as with other approaches such as grey systems, robust optimization and stochastic programming.
ALIREZA GOLI was born in Isfahan, Iran, in 1989. He received the bachelor's degree in industrial engineering from the Golpayegan University of Technology, Iran, in 2013, the master's degree in industrial engineering from the Isfahan University of Technology, Iran, in 2015, and the Ph.D. degree in industrial engineering from Yazd University, in 2019. He has published more than 40 articles. His current research interests include supply chain management, robust optimization, artificial intelligence, portfolio management, and meta-heuristic algorithms. He is currently working as an Active Editor/Reviewer for different reputed journals.
ERFAN BABAEE TIRKOLAEE was born in Sari, Iran, in 1990. He received the bachelor's and master's degrees in industrial engineering from the Isfahan University of Technology, Isfahan, Iran, in 2012, and the Ph.D. degree in industrial engineering from the Mazandaran University of Science and Technology, in 2019. He currently teaches at several universities. He has published more than 50 research articles in reputed journals and conferences. His research interests include waste management, supply chain management, routing problems, uncertain optimization, mathematical modeling, and metaheuristic algorithms. He has been serving as an Editorial Board Member and a Reviewer for reputed journals and conferences. He was recognized as a scientific elite by the Young Researchers and Elite Club, Islamic Azad University, in 2017, and by Iran's National Elites Foundation, in 2018.
HARI MOHAN PANDEY is currently a Lecturer of computer science with Edge Hill University, U.K. He specializes in computer science and engineering. He has published over 50 scientific articles in reputed journals and conferences. He is the author of various books in computer science engineering (algorithms, programming and evolutionary algorithms). His research interests include artificial intelligence, soft computing techniques, natural language processing, language acquisition, and machine learning algorithms. He has received the Global Award for the Best Computer Science Faculty of the Year 2015, the award for completing INDO-US Project GENTLE, the Certificate of Exceptionalism from the Prime Minister of India, and the award for developing innovative teaching and learning models for higher education. He has served as a Session Chair and a Lead Guest Editor, and has delivered keynotes.
WEIZHE ZHANG (Senior Member, IEEE) is currently a Professor with the School of Computer Science and Technology, Harbin Institute of Technology, China. He has published more than 100 academic articles in journals, books, and conference proceedings. His research interests are primarily in parallel computing, distributed computing, cloud and grid computing, and computer networks.