Service Composition and Optimal Selection in Cloud Manufacturing: State-of-the-Art and Research Challenges

Interest in Cloud Manufacturing (CMfg) has increased over the last few years. This study aims to identify current and state-of-the-art techniques and to synthesize quality attributes, objectives, and evaluation methodologies for service composition and optimal selection (SCOS) in CMfg. We used a systematic literature review (SLR) methodology to analyze 46 shortlisted primary studies, drawn from a total of 5872 studies accumulated from ten electronic databases. NVivo analysis software was used for data coding and qualitative analysis. First, a review scope was devised based on the research goals, and a pilot study was conducted to uncover potential search strings. Second, research identification, key data extraction, and deductive coding-based data analysis were performed. Multi-variant distribution approaches were adopted for data categorization. We found that research in this domain has increased, driven by the demand for rapid manufacturing. Although a few studies were based on industrial evaluations, scientific and empirically validated methodologies are still needed in this domain. This study provides an overview of SCOS in CMfg and highlights the identified future research areas.


I. INTRODUCTION
In the modern world, consumer-centric manufacturing has taken over from product-oriented manufacturing archetypes [1]. Cloud manufacturing (CMfg) is one such new networked manufacturing paradigm, based on a service-oriented model [2], [3], that enables users to accomplish personalized manufacturing tasks by choosing the required manufacturing resources, configuring them as needed, and using them on demand [4], [5]. It enables a multitude of collaborations between different enterprises providing manufacturing resources regardless of their structures and distances [6], by leveraging the latest technologies such as the Internet of Things (IoT) [7], [8], cloud computing [9], [10], the integration of multiple clouds in multi-centric management [11], service-oriented approaches [12], and other techniques [13].
The associate editor coordinating the review of this manuscript and approving it for publication was Md Zakirul Alam Bhuiyan.
CMfg evolves by adopting the latest scientific advancements to cope with variations in market demand, enabling enterprises to respond accordingly [14]. However, CMfg faces various challenges, such as service discovery, matching, and scheduling [15]; resource classification, encapsulation, optimal selection, and composition; and the adoption of new architectures to support various technologies and entirely different business models [6]. CMfg services are invoked based on user requirements, which are divided into single-service or multi-service requirement tasks; the latter require multi-variant services based on the sub-task distribution [16]. Quality of service (QoS) and logistics are among the CMfg restrictions when dealing with optimal selection and service composition in a multi-service requirement task [17]. Identifying the optimal composition of services from the available resource pool for a particular manufacturing task is an NP-hard problem [18]. Apart from identifying potential service combinations to tackle complex manufacturing tasks, service composition and optimal selection also play a vital role in the flexibility of the cloud management service workflow and have therefore received significant attention from the research community [19]. However, few research contributions take the form of a systematic study. This research presents an in-depth systematic review to highlight effective mechanisms and the various challenges governing them. The contributions of this article are as follows:
1) CMfg mechanisms for service composition and optimal selection have been comprehensively analyzed.
2) Popular publisher networks have been leveraged to identify the state-of-the-art research papers, high-impact journals, and prominent authors.
3) Existing case studies have been evaluated to elaborate on the methods applied to the datasets and the simulation tools used for the studies.
4) Studies have been evaluated qualitatively based on the credibility of research, appropriate documentation, adequacy of details, and accuracy of evaluation.
5) A thorough conclusion of the selected studies has been provided, along with the identification of research challenges and potential future research aspects.
The rest of this article is organized as follows: Section II presents service composition and optimal selection in CMfg. Section III presents the related work survey. Section IV covers the research methodology, search strategy, data extraction, and article classification strategy. Research results, along with discussions, are presented in Section V. Open issues and future roadmaps are specified in Section VI. Section VII details the threats to validity, their mitigation strategies, and the limitations of the study. Finally, Section VIII outlines the conclusion of this research.

II. SERVICE COMPOSITION AND OPTIMAL SELECTION PROBLEM IN CLOUD MANUFACTURING
In CMfg, the optimal utilization of enterprise resources according to user requirements is a challenging task. In the literature, SCOS has proven to be a primary technique for tackling issues across a broad spectrum of service compositions [20]. The SCOS process consists of task decomposition, service discovery, and service composition [6]. In the first stage, the user requirements submitted through CMfg are evaluated and divided into small sub-tasks [19], as shown in Figure 1, where the task T2 has been divided into T2.1 and T2.2. The second stage finds functionally matching services for each sub-task [21]. In the final stage, optimal services are identified by applying a preference-based, task-oriented composite service approach [22], as elaborated in Figure 1.
First, flow and functional matching decompose the required task into sub-tasks and abstract service compositions. A logical resource service order is achieved by assembling the resource service composition required to tackle the requested manufacturing task. Correlations within the resource service composition affect the QoS, and service composition should therefore account for them. Because of the enormous scale of resource services in CMfg, changes in these correlations are considered in the fabrication stage, where concrete resource services are mapped to the abstract composite service in order to select the optimal composition. The execution stage achieves the required outcome by invoking the concrete resource services bound to the optimal composition. The QoS indexes in CMfg are used to evaluate functional and non-functional properties. The literature on QoS parameters usually relies on ten indexes: cost, trust, reliability, response time, availability, execution time, energy, scalability, maintainability, and reputation. However, the total fitness function may also comprise satisfaction [7], throughput [23], success rate [24], service coverage [25], and various other service performance indexes. In order to generate optimal composite CMfg services based on the QoS of each concrete service combined, the relevant factors are aggregated across all selected services [26]. The composite service execution path consists primarily of four workflow patterns: sequential, in which subtasks are accomplished in turn; parallel, in which subtasks are performed simultaneously; selective, in which one subtask is chosen after a particular assessment; and circular, in which subtasks are accomplished cyclically [7].
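To make the aggregation of QoS values over the four workflow patterns concrete, the following minimal Python sketch may help. The aggregation rules (sums, maxima, products, probability-weighted expectations) are conventions commonly used in QoS-aware composition work and are assumed here; they are not taken from any specific reviewed study. The class `QoS`, its three attributes, and the function names are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class QoS:
    """Illustrative QoS vector for one service (attributes and units are assumed)."""
    cost: float         # monetary cost, additive
    time: float         # execution time, additive along a path
    reliability: float  # success probability in [0, 1]


def sequential(parts):
    # Subtasks run one after another: costs and times add, reliabilities multiply.
    return QoS(sum(p.cost for p in parts),
               sum(p.time for p in parts),
               prod(p.reliability for p in parts))


def parallel(parts):
    # Subtasks run simultaneously: the slowest branch determines the time.
    return QoS(sum(p.cost for p in parts),
               max(p.time for p in parts),
               prod(p.reliability for p in parts))


def selective(parts, probs):
    # Exactly one branch is chosen: use the probability-weighted expectation.
    return QoS(sum(w * p.cost for p, w in zip(parts, probs)),
               sum(w * p.time for p, w in zip(parts, probs)),
               sum(w * p.reliability for p, w in zip(parts, probs)))


def circular(part, k):
    # One subtask repeated k times in a loop.
    return QoS(k * part.cost, k * part.time, part.reliability ** k)
```

Because every function returns a `QoS` value, the patterns can be nested, for example `sequential([a, parallel([b, c])])`, to aggregate a composite execution path.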

III. RELATED WORK
The literature shows that research on service composition and optimal selection has had an impact on various fields, including cloud computing, cloud manufacturing, and IoT. However, because the field changes rapidly, a thorough investigation of CMfg is required. This section reviews the existing review articles on the service composition and optimal selection problem in CMfg. In [27], a systematic literature review (SLR) of state-of-the-art service composition approaches in cloud manufacturing is presented. The articles are categorized by objective function, highlighting various aspects, and research challenges and future directions are included. However, the review does not contain a quality assessment of the selected studies. An SLR from the perspective of computational intelligence for QoS-aware cloud service composition is presented in [28]. That study divides the selected articles into heuristic, non-heuristic, and meta-heuristic approaches and presents challenges as well as future directions. A thorough literature review, including extensive CMfg details and a classification of methods based on algorithm optimization and other factors, is presented in [6]. However, it does not describe the article selection method and is a non-systematic survey. A comprehensive review of CMfg issues is given in [29]; it evaluates the characteristics of service composition and aggregation and examines the novelty of existing research and future directions, but it focuses on a few methods without a detailed evaluation of each.
VOLUME 8, 2020
Table 1 summarizes the review studies on SCOS in CMfg considered in this work, listing key parameters including citations, QoS parameters, covered years, references used, publishers, and covered aspects. Seven of the ten studies used the SLR methodology to evaluate service composition and selection approaches, whereas the others are surveys related to SCOS. These studies provide a thorough foundation for the field; however, they have the following weaknesses.
• Classification of reviewed techniques has not been considered in some studies.
• Most of the articles have failed to review the latest techniques put forward in the last three years.
• Investigated techniques lack a few of the essential parameters in most of the studies.
• Cloud manufacturing is not a core focus of most of the studies.
• Effective methods have not been identified by most of the studies.
• Discussion of open issues and potential future direction have been ignored in some studies.
• Most of the studies have failed to perform an extensive literature review that leverages various publisher networks and digital libraries to evaluate the top journals, authors, datasets, simulations, algorithms, case studies, and other essential factors.
• Qualitative analysis has not been performed by most of the studies.

IV. RESEARCH METHODOLOGY
In recent years, the systematic literature review (SLR) has gained popularity in the field of computer science [41]. An SLR assesses and interprets scientific evidence through a well-defined approach in order to address a specific topic by answering research questions in an unbiased manner; the detailed analysis rests on a scientific methodology that ensures transparency and replicability [42]. In this article, we provide an SLR of SCOS in CMfg to gain a thorough and in-depth understanding of current research, as elaborated in Figure 2. Three phases govern this study. Phase 1 is the planning phase, in which the review scope is developed based on the research goals and a pilot study is conducted to uncover potential search strings. Phase 2 covers a broad spectrum of activities, including research identification, selection of papers, extraction of key data, deductive coding-based data analysis, and synthesis. Finally, the last phase is the documentation of the findings. The research questions formalized in the following sections cover various aspects of SCOS in CMfg.

A. OBJECTIVES AND RESEARCH QUESTIONS
To systematically define the study objectives, we adapted the Goal-Question-Metric (GQM) approach [43]. This research's consolidated objective is defined as: "Analyzing the service composition and optimal selection approaches from the viewpoint of a researcher for characterization concerning research intensity and characteristics, algorithms, evaluation methodologies, and service quality attributes, in the context of cloud manufacturing." The research questions (RQs) based on the research objectives, along with their rationales, are as follows:
• RQ1: What are the characteristics and intensity of the research pertaining to service composition and optimal selection in the context of CMfg? Identification and structuring of the selected primary studies based on the type of research and contribution, along with the study quality [...] area and evaluating top-quality publishers, journals, and papers in the field of SCOS in CMfg.
RQ1 aims to identify the characteristics of, and insights into the intensity of, research on service composition and optimal selection in CMfg. Table 7 contains information regarding the analysis we conducted on the basis of bibliographic data, the type of research conducted, and the contributions put forward. RQ2 focuses on an extended analysis of QoS parameters to identify, synthesize, and evaluate the approaches, datasets, algorithms, simulation tools, objectives, and case studies. Finally, RQ3 summarizes the existing research on the service composition and optimal selection problem in CMfg.
Therefore, this research investigates the methodology used in CMfg and the current research trends based on high-quality papers from top journals. It also highlights the strategies, features, and characteristics devised in current research. Based on the aims mentioned above, we have drawn up research questions that unravel the adopted strategies, investigation techniques, and evaluation criteria for research outcomes.

B. SEARCH STRATEGY
In order to answer the research questions in an SLR, it is essential to identify the relevant studies [44]. In the literature, various approaches have been put forward to develop and evaluate the search strategy [45]. To improve the search string gradually, we adopted an iterative approach in this study. To retrieve relevant studies with minimal noise and to find the optimal search strategy, several pilot study iterations were conducted (in May and June 2020) on various bibliographic databases. Initially, the pilot study used the following search string: ("cloud manufacturing" OR CMfg) AND ("service composition" OR "service selection"). To include related topics and collect multifaceted data, these two main elements were rephrased using various combinations of synonyms, along with the logical operators "AND" and "OR".
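The construction of such a search string from groups of synonyms can be sketched in a few lines of Python. This is only an illustration of the OR-within-concept, AND-across-concepts pattern; the helper names `or_group` and `build_query` are hypothetical, and individual databases may require their own query syntax.

```python
def or_group(terms):
    """Join synonyms with OR; quote multi-word phrases so they match exactly."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """AND together one OR-group per concept."""
    return " AND ".join(or_group(g) for g in groups)


# The two main elements of the initial pilot search string.
context = ["cloud manufacturing", "CMfg"]
problem = ["service composition", "service selection"]
query = build_query(context, problem)
print(query)
# → ("cloud manufacturing" OR CMfg) AND ("service composition" OR "service selection")
```

Adding a synonym to either list extends the corresponding OR-group without changing the overall structure of the query.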
As elaborated in Table 2, ten data sources were employed to conduct an extensive search in the field of SCOS in CMfg. These data sources are either offered directly by the top publishers or widely accepted in the field of computer science. The selected data sources include Google Scholar, Web of Science, Scopus, IEEE Xplore, ACM Digital Library, Microsoft Academic, DBLP, Semantic Scholar, Taylor & Francis, and MDPI. Applying the search string to multiple fields, including the title, abstract, body, and other sections of the paper, to make the search as broad as possible led to a set of 5872 results, as shown in Table 3.

C. SCREENING OF RELEVANT PAPERS
In order to minimize the threat of missing relevant studies, the screening process comprises criteria for the inclusion and exclusion of studies, a well-defined selection process, and the inclusion of additional studies through snowball sampling. Each of these stages is described in the following sections.

1) SELECTION CRITERIA
In this literature review, a study was included if it presented a scientific contribution to the body of knowledge on service composition and optimal selection in cloud manufacturing. The search results, including both theoretical and empirical studies, were included based on the following criteria:
• Studies that addressed SCOS in CMfg at any level of abstraction, including algorithms, datasets, simulation, quality attributes, and other factors.
• Studies that identified various approaches, including single- and multi-objective, for SCOS in CMfg.
The exclusion criteria adopted are as follows:
• Studies that addressed the SCOS problem but not in the context of CMfg.
• Studies that addressed aspects of CMfg other than SCOS.
• Duplicate articles, non-peer-reviewed papers, prefaces, keynotes, speeches, introductions to special issues, calls for papers, books, and other content types.
• Studies published in conferences, in order to restrict the pool to high-quality primary studies.
• Non-English studies, that is, studies disseminated in other languages.
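The inclusion and exclusion criteria above can be expressed as a simple screening filter. The following Python sketch is illustrative only: the `Record` class, its field names, and the reason labels are hypothetical, and the two relevance flags stand for judgments made by a human reviewer rather than anything computed automatically.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical bibliographic record; field names are assumptions."""
    title: str
    language: str = "English"
    kind: str = "journal"          # e.g. "journal", "conference", "book", "keynote"
    peer_reviewed: bool = True
    addresses_scos: bool = True    # judged by a human reviewer
    in_cmfg_context: bool = True   # judged by a human reviewer


def exclusion_reason(rec, seen_titles):
    """Return the first exclusion criterion that applies, or None to include."""
    key = rec.title.strip().lower()
    if key in seen_titles:
        return "duplicate"
    if rec.language != "English":
        return "non-English"
    if rec.kind != "journal" or not rec.peer_reviewed:
        return "excluded content type"
    if not rec.addresses_scos:
        return "CMfg study not about SCOS"
    if not rec.in_cmfg_context:
        return "SCOS study outside CMfg"
    return None


def screen(records):
    """Split records into included studies and (record, reason) exclusions."""
    included, excluded, seen = [], [], set()
    for rec in records:
        reason = exclusion_reason(rec, seen)
        if reason is None:
            included.append(rec)
            seen.add(rec.title.strip().lower())
        else:
            excluded.append((rec, reason))
    return included, excluded
```

Recording a reason for every excluded record keeps the screening auditable, which supports the replicability that an SLR aims for.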

2) SELECTION PROCESS FOR THE PRIMARY STUDIES
The primary studies were screened and selected on the basis of the process elaborated in Figure 3. The results extracted from the various databases were added to a reference management system (Paperpile) to organize and cite the studies in this paper appropriately. First, all the retrieved studies (5872) were superficially evaluated. These results contained many irrelevant papers that were not explicitly related to the SCOS problem in CMfg. Therefore, the second iteration restricted the search to the title only: the selected online databases were restricted to retrieving papers containing the keywords in the title, and 415 primary studies were extracted at this stage. The next iteration excluded the non-English studies and removed all duplicates. The exclusion procedure then removed all citations, patents, conference papers, book chapters, notes, and early access papers. Moreover, all surveys and reviews in the remaining results were excluded. Finally, the results from top publishers (as shown in Table 4), including Elsevier, IEEE, Springer, Taylor & Francis, and MDPI, were retained, whereas the other studies were excluded, resulting in a set of 42 articles fulfilling the scope of the research and the adopted inclusion criteria.

3) SELECTION OF ADDITIONAL STUDIES BY SNOWBALL SAMPLING
This study used a backward and forward snowball sampling process to complement the selection process [46]. The references and citations of the primary studies were analyzed to include more relevant studies in this research. The titles and, if required, the full texts of the shortlisted studies were examined in order to include any relevant study that we missed in our data accumulation process. At the end of this stage, the final primary study pool included four additional studies, totaling 46 articles.
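One round of backward and forward snowballing can be sketched as a set operation over a citation map. The sketch below is a generic illustration, not the procedure's actual tooling: the function `snowball` and the dictionary layout are assumptions, and the returned candidates would still need title and full-text screening by a reviewer.

```python
def snowball(primary, references):
    """One round of backward and forward snowballing.

    primary:    set of study ids already selected
    references: dict mapping each known study id to the set of ids it cites
    Returns candidate ids that still need title/full-text screening.
    """
    # Backward: everything a primary study cites.
    backward = set()
    for study in primary:
        backward |= references.get(study, set())
    # Forward: every known study that cites at least one primary study.
    forward = {s for s, cited in references.items() if cited & primary}
    # Studies already in the pool are not candidates.
    return (backward | forward) - primary
```

For example, with `primary = {"P1"}` and `references = {"P1": {"R1"}, "C1": {"P1"}}`, the candidates are `R1` (backward) and `C1` (forward).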

D. DATA EXTRACTION

1) STUDY QUALITY ASSESSMENT
In the literature, various researchers have proposed guidelines for quality assessment in an SLR [43], [47]. However, quality assessment remains debatable because there is no universally accepted definition of study quality [48]. Therefore, a checklist for quality assessment is the most practical option. In this research, we adopted the guidelines presented in [43] for the quality assessment of the primary studies, based on the questions and scores shown in Tables 5 and 6, respectively. We evaluated QA1 based on the proposed algorithm, case study, and analysis. QA2 relies on the detailed methodology, whereas QA3 considers the comparison of the proposed approach with existing approaches. Finally, QA4 is based on a thorough evaluation of the proposed approach. The scores assigned to the studies are shown in Tables 7 and 8.

2) DATA SYNTHESIS
In order to address RQ1, we adopted a descriptive statistics approach. Furthermore, we evaluated the publication year, source, research and contribution type, and quality of the selected primary studies to answer RQ2 and RQ3, based on the frequency of quantitative descriptions. To identify recurring patterns, various code categories were mapped to different labels related to concepts and findings extracted using thematic synthesis in the NVivo research analysis tool, which enabled us to create different codes to link sentences of the references found and so achieve a detailed analysis. In order to investigate specific SCOS approaches (such as heuristic-based, single-objective, and multi-objective) and to identify, classify, and summarize quality attributes, algorithms, and evaluation methods, we used deductive coding in NVivo. Using the inductive synthesis approach, we evaluated and refined the initial code categories to achieve more reliable categories.
NVivo supports qualitative data analysis through numerous embedded features; owing to its reliability, it is considered a well-established software package. Numerous studies in the field of computer science have reported positive experiences with it when conducting the analysis in a systematic literature review [41], [84]. NVivo arranges the data in documents and codes, supporting static categories of data called sets. In our study, the selected primary studies were the documents, whereas the codes contained the data of interest extracted from the primary studies, categorized by algorithm, methodology, case study, and various other factors. NVivo enabled us to import documents (the selected primary studies) and to search, retrieve, code, and review coded textual data, enhancing the accuracy of the categorization of articles into coded segments. The principles of coding and numerous other methods resemble the structuring of coded categories in NVivo, as it is "method free" software [41], [85]. A considerable amount of time and effort is required to perform qualitative data analysis on a large amount of data; therefore, NVivo helped make the systematic qualitative data analysis electronic. Previous research [86] has noted that NVivo saves researchers the time required for manual coding, speeding up the analysis process. Furthermore, using electronic rather than manual search can reduce human error and yield more reliable results.

E. ARTICLE CLASSIFICATION
The authors evaluated the selected primary studies in great detail and adopted a multi-variant classification. The studies were first classified on the basis of study approach, which consists of two categories (validation or comparative). They were further classified on the basis of contribution type, which includes model (Mo) and framework, method, technique, approach, and scheme (FMTAS). Moreover, the presence of an algorithm, case study, test case, and analysis was used to classify the studies further, as elaborated in Table 7. In the second phase, the selected primary studies were classified based on the objective function used in the service composition and selection mechanisms. The 46 selected studies were thus categorized into single-objective and multi-objective techniques. It is evident from Figure 4 that 26 of the 46 selected studies (approximately 57%) relate to single-objective techniques, whereas the remaining 20 primary studies (43%) relate to multi-objective techniques. A comprehensive investigation of these is presented in the next section.
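The reported shares follow directly from the study counts, as the short Python check below shows. Only the two counts come from the text; the variable names are illustrative.

```python
# Counts of primary studies per objective-function category, as reported above.
counts = {"single-objective": 26, "multi-objective": 20}

total = sum(counts.values())
shares = {k: round(100 * v / total) for k, v in counts.items()}
print(total, shares)
# → 46 {'single-objective': 57, 'multi-objective': 43}
```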

V. FINDINGS AND DISCUSSIONS

A. RESEARCH AND CONTRIBUTION TYPE
The initial classification of primary studies is based on research type and contribution type. Comparative research accounts for 57% of the primary studies, whereas 43% relate to validation. Figure 5 expands the classification based on the contributions put forward by the selected primary studies. The largest share of the studies (33%) proposed or extended an algorithm, whereas 19% focused on FMTAS. Furthermore, 19% of the studies presented case studies to evaluate the proposed technique in a particular scenario. Models to solve SCOS in CMfg and articles that adopted analysis account for 10% each, and test cases were used in 9% to demonstrate the efficiency of the proposed scheme.

B. SERVICE COMPOSITION AND SELECTION TECHNIQUES
Based on the objective function, the 46 selected articles can be divided into single-objective and multi-objective. Figure 6 shows the annual distribution of single- and multi-objective papers. Since 2017, the average number of articles on single- and multi-objective techniques published annually is ten, whereas in 2020, so far, six articles have been published. Table 9 lists the studies for both techniques with publisher, journal, quality, open access, impact factor, reference, and publication year. The QoS parameters for both techniques have been generalized into cost (c), quality (q), reliability (r), time (t), trust (tr), and usability (u), as elaborated in Figure 7. Table 10 shows the increase and decrease in each QoS index, along with the methodology used in a particular study. The QoS indicators are represented by high (↑) and low (↓), along with color to indicate the benefit (green) and drawback (red) of each approach.
Of the 46 selected primary studies, 26 are based on the single-objective technique. Among these, 15 studies included time as a QoS parameter, of which 40% achieved efficient results. Quality was considered by 14 studies, of which 64% achieved higher-quality results. Likewise, trust was evaluated in 8 articles, achieving a 75% positive trust level. Usability was evaluated in 18 manuscripts, with equal positive and negative results. Similarly, 19 studies considered reliability as a QoS parameter and achieved 74% positive results. Finally, the most considered attribute is cost, which achieved a 95% positive result.
In contrast, a multi-objective technique was adopted in 20 studies. The most frequently selected parameter, considered in 20 studies, is time, with 75% positive results. Only three articles considered quality, with 100% positive results. Trust achieved 55% positive results across 11 studies, usability only 33% positive results across 18 articles, and reliability a 56% increase across 16 manuscripts. Finally, reduced cost achieved 100% positive results in 19 studies.
Table 11 presents the additional details extracted from the primary studies, which are summarized in Figure 8. It is evident from Figure 8(A) that heuristic-based approaches have been adopted more often (76%) than non-heuristic approaches (24%). The methodology is comprehensively explained in 98% of the studies, whereas the approach used in 2% is not evident, as shown in Figure 8(B). The explicit function is elaborated in 89% of the studies and not evident in 11%, as shown in Figure 8(C). Similarly, Figure 8(D) shows that the fitness function is a minimization in 20% and a maximization in 46% of the studies; however, the fitness function type is not evident in 35%. Among the selected studies, 87% have QoS constraints, whereas 13% do not, as shown in Figure 8(E). The methodology is compared with existing approaches in 57% of the studies, as shown in Figure 8(F), leaving 43% of the methods not compared. As elaborated in Figure 8(G), 46% of the studies include case studies and 54% do not. Lastly, Figure 8(H) shows that 11% of the studies include a penalty function, compared with 89% that do not.
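The fitness-function properties tallied in Figure 8 (minimization or maximization, QoS constraints, and a penalty function) can be illustrated with a generic weighted-sum formulation. The Python sketch below is one common way to combine them and is not the formulation of any particular reviewed study; the function name, the min-max normalization, the linear penalty, and the default penalty coefficient are all assumptions.

```python
def fitness(qos, weights, bounds, constraints, penalty=10.0):
    """Weighted-sum fitness to be maximized, with a linear constraint penalty.

    qos:         dict attribute -> aggregated value of the composite service
    weights:     dict attribute -> (weight, "min" or "max")
    bounds:      dict attribute -> (lowest, highest), used for normalization
    constraints: dict attribute -> limit; an upper limit for "min" attributes,
                 a lower limit for "max" attributes
    """
    score = 0.0
    violation = 0.0
    for attr, (w, direction) in weights.items():
        lo, hi = bounds[attr]
        norm = (qos[attr] - lo) / (hi - lo)          # scale to [0, 1]
        # Attributes to minimize (e.g. cost) contribute more when they are low.
        score += w * (norm if direction == "max" else 1.0 - norm)
        if attr in constraints:
            limit = constraints[attr]
            gap = qos[attr] - limit if direction == "min" else limit - qos[attr]
            violation += max(0.0, gap) / (hi - lo)   # normalized overshoot
    return score - penalty * violation
```

Converting "min" attributes with `1.0 - norm` turns every objective into a maximization, so a single scalar can rank candidate compositions, while the penalty term pushes constraint-violating compositions below feasible ones.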

C. ADDITIONAL DETAILS EXTRACTED FROM THE PRIMARY STUDIES
The following section entails the details of the extracted information:

1) QoS PARAMETERS
The studies selected for this SLR use various QoS parameters as fitness criteria. Figure 9 shows that cost and execution time are the most frequently selected parameters, used in 21% of studies. Reliability comes second with a 14% share, whereas availability was evaluated in 19 studies. Moreover, scalability, reputation, and energy consumption were adopted in 15, 13, and 12 articles, respectively. All other parameters were used in fewer than ten articles.

2) ALGORITHMS
Most of the studies employed evolutionary algorithms, specifically genetic algorithms, followed by artificial bee colony. However, many algorithms have been used by only one study, as shown in Table 11. The abbreviations of algorithms are elaborated in Table 12.
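To indicate how a genetic algorithm is typically applied to SCOS, the following Python sketch encodes a composition as one chosen candidate index per subtask and evolves a population with truncation selection, one-point crossover, and per-gene mutation. It is a generic textbook-style GA written for illustration, not the algorithm of any reviewed study; the function name, parameter defaults, and encoding are assumptions.

```python
import random


def genetic_scos(candidates, fitness, generations=50, pop_size=20,
                 mutation_rate=0.1, seed=0):
    """Pick one candidate service index per subtask with a simple GA.

    candidates: list where candidates[i] is the list of services for subtask i
    fitness:    function mapping a list of chosen services to a score
                (higher is better)
    """
    rng = random.Random(seed)  # seeded for reproducible runs
    n = len(candidates)

    def random_plan():
        return [rng.randrange(len(c)) for c in candidates]

    def score(plan):
        return fitness([candidates[i][g] for i, g in enumerate(plan)])

    population = [random_plan() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]          # truncation selection, keeps elites
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n):                         # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(candidates[i]))
            children.append(child)
        population = parents + children
    best = max(population, key=score)
    return best, score(best)
```

A call such as `genetic_scos(cands, lambda chosen: -sum(cost for cost, _ in chosen))`, where each service is a hypothetical `(cost, reliability)` tuple, returns the index chosen for each subtask together with its fitness. Being a heuristic, the search is not guaranteed to return the global optimum.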

3) DATASET
The experimentation carried out in the selected primary studies was based on data, of which 65% was random, 13% was synthetically generated, and 2% was collected real data, whereas 17% of the articles did not use a dataset, as elaborated in Figure 10.

4) CASE STUDY
The selected primary studies presented case studies to evaluate the proposed approaches in real-world scenarios via simulation or numerical proof. The authors thoroughly evaluated each study, and extracts of the case studies are shown in Table 13.

5) SIMULATION TOOLS
Various simulation tools have been used in the selected primary studies to demonstrate the efficiency of the proposed solutions. MATLAB has been the favorite choice of researchers, used by 38% of studies. Visual Studio, Java, Eclipse, and C# share second place, whereas Python and GAMS have been used in one study, as elaborated in Figure 11; details are given in Table 11.

D. OVERVIEW AND IMPLICATIONS OF RESEARCH FINDINGS
RQ1: The information extracted from the primary studies was analyzed, and details consisting of the type of research, contribution, and quality were evaluated in order to address this question. The findings indicate that research on service composition and optimal selection in CMfg has received increasing attention since 2013, so it is essential to identify the studies with promising solutions to lay a foundation for future research. The results show that diverse journals from top publishers have disseminated studies within this domain. The publication share of Springer is 46%, followed by Taylor & Francis (20%) and Elsevier (17%), whereas IEEE (11%) and MDPI (6%) have smaller shares. Furthermore, more than 75% of the studies are contributed by China and 12% by Iran, whereas the share of other countries is lower than 10%. The rapid development of new techniques and methodologies, driven by the growth in the size and complexity of service composition and optimal selection problems needed to handle the exponential growth of manufacturing and production on the cloud, will come hand in hand with further challenges. Based on the results accumulated in this study, we assume that the increasing demand for scalability, better interoperability, and the consideration of a larger number of QoS parameters will lead to enhanced techniques in the future. As a result, service composition and optimal selection will face further challenges in achieving optimal service selection from an extensive pool of cloud manufacturing service providers based on a broad spectrum of client requirements, which will undoubtedly open a new horizon of exciting opportunities for future studies.
RQ2: To answer this question, we used a deductive coding approach to extract algorithms, case studies, methodologies, QoS parameters, fitness functions, and various other aspects related to the SCOS problem in CMfg. We found that the selected primary studies use various QoS fitness indexes, including availability, reliability, cost, execution time, computational complexity, scalability, energy consumption, reputation, trust, maintainability, and quality, along with other criteria. To capture the performance of each QoS parameter appropriately, we generalized these into six groups consisting of cost, quality, reliability, time, trust, and usability, as elaborated in Figure 7. The studies were categorized by objective function into single-objective and multi-objective studies. We found that most of the works considered the essential QoS parameters of cost, time, reliability, and availability, whereas a limited number of studies considered other parameters. Likewise, most of the 46 selected primary studies were based on a single objective, whereas 20 studies focused on a multi-objective function. No study considered more than three objectives. Section V-A defined six QoS indexes consisting of cost, quality, reliability, time, trust, and usability. These primary indexes are further divided into 24 sub-indexes to enhance the understanding of the QoS parameters used in existing studies. Though the systematic approach adopted and the research perspective achieved in this research differ from those of preceding studies, some relationships were observed between the current SLR and prior studies in the scope of research questions and study categorization. For example, the authors of [27] reported the QoS parameters used by existing studies and characterized the studies as single-objective or multi-objective. As the primary focus of that study was on the objective function in broad terms, the authors were unable to establish an in-depth analysis of the various QoS parameters.
RQ3: To address this question, we collected and examined the selected primary studies focused on the SCOS problem in CMfg. Our investigation found only a handful of studies that considered an extended objective function comprising multiple QoS constraints. We believe that SCOS approaches beyond those reported in this study may exist; furthermore, the existing literature fails to evaluate the co-existence of various QoS parameters along with the primary constraints. Moreover, the QoS parameters are considered individually, and their collective impact on SCOS has not been thoroughly evaluated.
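To illustrate how several QoS groups can enter a single objective function, the following minimal Python sketch scores candidate services with a weighted sum over the six generalized groups. It is only a sketch under stated assumptions: the min-max normalization, the weights, the treatment of cost and time as minimization criteria, and all function names are hypothetical choices made for this example and are not taken from any of the reviewed studies.

```python
from typing import Dict, List

# The six generalized QoS groups named in this review.
QOS_GROUPS = ("cost", "quality", "reliability", "time", "trust", "usability")
# Assumption: cost and time are minimized; the other groups are maximized.
MINIMIZE = {"cost", "time"}

Service = Dict[str, float]


def normalize(value: float, lo: float, hi: float, minimize: bool) -> float:
    """Min-max normalize to [0, 1] so that a higher score is always better."""
    if hi == lo:
        return 1.0  # all candidates are equal on this criterion
    score = (value - lo) / (hi - lo)
    return 1.0 - score if minimize else score


def fitness(candidate: Service, pool: List[Service],
            weights: Dict[str, float]) -> float:
    """Weighted sum of normalized QoS scores (hypothetical fitness function)."""
    total = 0.0
    for group in QOS_GROUPS:
        values = [s[group] for s in pool]
        total += weights[group] * normalize(
            candidate[group], min(values), max(values), group in MINIMIZE
        )
    return total


def select_best(pool: List[Service], weights: Dict[str, float]) -> Service:
    """Return the candidate with the highest weighted-sum fitness."""
    return max(pool, key=lambda s: fitness(s, pool, weights))


if __name__ == "__main__":
    # Illustrative values only, not data from any study.
    pool = [
        {"cost": 10, "quality": 0.9, "reliability": 0.95,
         "time": 5, "trust": 0.8, "usability": 0.7},
        {"cost": 20, "quality": 0.6, "reliability": 0.90,
         "time": 8, "trust": 0.7, "usability": 0.6},
    ]
    weights = {g: 1 / len(QOS_GROUPS) for g in QOS_GROUPS}
    print(select_best(pool, weights))
```

A weighted sum collapses all criteria into one score, which is a single-objective formulation; a multi-objective formulation would instead keep the criteria separate and compare candidates by dominance.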
Implications: This SLR provides a systematic synthesis and classification of the service composition and optimal selection problem in CMfg. We selected the primary studies from top-quality journals and examined their contributions in detail in order to elaborate on the observations in this SLR. This research can support academia in continuing SCOS research in CMfg. Furthermore, the reported research types, contributions, and empirical evaluations will help to fill research gaps by presenting state-of-the-art methodologies to researchers, supporting novel research in CMfg, specifically on service composition and optimal selection.

VI. OPEN ISSUES AND FUTURE RECOMMENDATIONS
Research on service composition and optimal selection in CMfg is still in an early phase. Therefore, many opportunities are open to researchers to investigate and improve SCOS methods, models, algorithms, the adoption of QoS factors, multi-objective approaches, fitness functions, and a wide variety of other aspects. The findings indicate that this domain is receiving more attention due to the growing demand for manufacturing resources. The challenges and gaps identified in this study, which could be useful for future studies in this domain, are summarized as follows:
(1) Service provider interests: Studies should consider the interests of service providers in order to help them identify the positive or negative impact of their contributions.
(2) Algorithm efficiency: Algorithms and models devised with continuous tasks, their constraints, and inventory in mind would enhance accuracy and efficiency.
(3) Resource efficiency: It is essential to evaluate the efficiency of resources and appropriate task decomposition.
(4) Dataset: It is essential to devise a standard dataset based on anonymized real data so that the efficiency of proposed methodologies and algorithms can be assessed accurately.
(5) QoS semantics: QoS semantics could be utilized to achieve efficient composition of services based on intelligent algorithms in a big data environment with a QoS representation of service providers.
(6) Fitness function: Research is needed on QoS-aware web service composition based on an efficient multi-objective service composition algorithm and fitness function, regardless of the composition schema.
(7) Extending existing approaches: Multi-task service composition and scheduling techniques proposed in the literature could be extended to tackle a wide variety of resource allocation and scheduling issues.
(8) Large-scale SCOS problems: Parallel computation can be used to tackle large-scale SCOS problems, employing evolutionary algorithms that have achieved noticeable results in the existing literature.
(9) Multi-objective approaches: Existing multi-objective techniques consider at most three objectives; it is essential to extend the scope by including more objectives in future studies. Moreover, service impact should be evaluated by considering multiple objectives collectively.
(10) Dynamicity and maintainability: Taking into account the dynamics of quality, correlation, resources, and services is essential to reflect the dynamic nature of real-world SCOS problems in CMfg. Furthermore, service composition approaches should consider the dynamicity of custom requirements, as well as the maintenance factor of the service provider, which might lead to interruptions or unavailability of services, in order to devise appropriate service composition and optimal selection solutions based on selective, parallel, and loop structures of the sub-tasks.
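The selective, parallel, and loop structures mentioned in item (10) are typically handled by aggregating the QoS of sub-tasks bottom-up. The Python sketch below shows one way this could be done. The aggregation rules (sum, maximum, product, and probability-weighted sum) are rules frequently used in the QoS-aware composition literature and are stated here as assumptions, as are the sequential structure, the class names, and the three attributes chosen; none of this is a method prescribed by this review.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Tuple, Union


@dataclass
class Task:
    """QoS of an atomic sub-task (or of an aggregated sub-structure)."""
    time: float
    cost: float
    reliability: float


@dataclass
class Sequence:
    parts: List["Node"]


@dataclass
class Parallel:
    parts: List["Node"]


@dataclass
class Selective:
    branches: List[Tuple[float, "Node"]]  # (selection probability, branch)


@dataclass
class Loop:
    body: "Node"
    iterations: int


Node = Union[Task, Sequence, Parallel, Selective, Loop]


def aggregate(node: Node) -> Task:
    """Recursively reduce a composition structure to a single QoS triple."""
    if isinstance(node, Task):
        return node
    if isinstance(node, Sequence):
        parts = [aggregate(p) for p in node.parts]
        # Sequential: times and costs add up, reliabilities multiply.
        return Task(sum(p.time for p in parts),
                    sum(p.cost for p in parts),
                    prod(p.reliability for p in parts))
    if isinstance(node, Parallel):
        parts = [aggregate(p) for p in node.parts]
        # Parallel: the slowest branch determines time; costs still add up.
        return Task(max(p.time for p in parts),
                    sum(p.cost for p in parts),
                    prod(p.reliability for p in parts))
    if isinstance(node, Selective):
        parts = [(prob, aggregate(b)) for prob, b in node.branches]
        # Selective: expected value over the branch probabilities.
        return Task(sum(prob * q.time for prob, q in parts),
                    sum(prob * q.cost for prob, q in parts),
                    sum(prob * q.reliability for prob, q in parts))
    if isinstance(node, Loop):
        body = aggregate(node.body)
        k = node.iterations
        # Loop: the body is executed k times.
        return Task(k * body.time, k * body.cost, body.reliability ** k)
    raise TypeError(f"unsupported node type: {type(node).__name__}")


if __name__ == "__main__":
    # Illustrative values only, not data from any study.
    plan = Sequence([
        Task(2, 10, 0.9),
        Parallel([Task(3, 5, 1.0), Task(4, 6, 0.5)]),
        Loop(Task(1, 1, 1.0), 3),
    ])
    print(aggregate(plan))
```

Reducing every structure to the same `Task` triple keeps the recursion uniform, so the aggregated result can be passed directly to whatever fitness function a study adopts.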

VII. LIMITATIONS (THREATS TO VALIDITY AND MITIGATION STRATEGIES)
In this study, the authors have tried to conduct the systematic literature review as meticulously as possible and have interpreted the research findings with care. However, some threats to validity may still exist. In this section, we elaborate on the strategies adopted to minimize the effects of several potential validity threats.

A. RESEARCH SCOPE
1) IDENTIFICATION AND SELECTION OF PRIMARY STUDY
The identification and selection of primary studies, along with data extraction, is one of the critical threats to the validity of the results. It is essential to include as broad a spectrum of relevant studies within the scope as possible [44]; however, this is a challenging task. To identify as many relevant studies as possible and reduce the risk of overlooking relevant ones, the authors employed a systematic, iterative search strategy over ten bibliographic data sources (see Section IV-B), following widely evaluated and accepted guidelines and search strategies used in academic publications. To accumulate an appropriate number of relevant studies, the search strategy was devised on the basis of a pilot study that included multiple iterations of experimental search. As established in Section IV-D1, the authors have tried their best to evaluate the studies with minimal chances of subjective evaluation, misinterpretation, and bias.

2) IDENTIFICATION AND SYNTHESIS OF ALL RELEVANT STUDIES
An SLR aims to include all relevant research in a field of interest; however, the literature shows that identifying and synthesizing all possibly relevant studies is unlikely [45], [87], although a good sample of relevant research articles can be attained [87]. In this research, our objective was to devise a research strategy that would include as many relevant studies as possible as primary studies while keeping noise in the selection process minimal. In order to map the studies conducted in this domain based on relevant literature reviews and research questions, we constructed a search string that covered top-quality journal publications from 2013 to 30/06/2020. Although the authors took additional steps, such as backward and forward snowball sampling (see Section IV-C3), to include more relevant studies, the possibility of missing relevant studies cannot be ruled out.

B. RESULTS VALIDITY
1) PUBLICATION BIAS
Publication bias arises when positive results are treated as a publication opportunity while negative results are neglected in comparisons of methods and techniques [47], [88]. To avoid this threat, the authors have not included any comparisons of the approaches, methodologies, algorithms, simulation tools, and case studies.

2) VALIDITY OF RESEARCH RESULTS
One of the potential threats to the validity of the research, arising from the extraction and interpretation of data from the primary studies, is researcher bias. The likelihood of devising a precise query that identifies the relevant information in the text is relatively low; therefore, to mitigate this threat, to avoid the laborious task of manual qualitative data analysis of the primary studies, and to achieve higher data extraction accuracy in a fraction of the time, we leveraged NVivo for the analysis, as elaborated in Section IV-D2.

3) RELIABILITY OF RESEARCH RESULTS
To ensure the replicability of the research results [89], which is required to assure reliability, we thoroughly documented the protocol adopted in this review, including the precisely implemented steps, bibliographic data sources, and search strings, as described in detail in Section IV. Even though the authors have adopted various ways to mitigate the effects of this threat, several relevant research articles may have been overlooked for various reasons, such as the selection of inappropriate keywords, which might lead to a different sample of results for this systematic literature review.

VIII. CONCLUSION
This research presents a systematic literature review that thoroughly investigates current and state-of-the-art studies. The authors selected multiple search sources to include a variety of research that might not have been reachable through a single source. Moreover, the search was restricted to top publishers, including Elsevier, Hindawi, IEEE, MDPI, Springer, and Taylor & Francis. The cumulative search in the first phase returned 5872 results. After applying various filtration and snowball sampling processes, 46 articles were selected as primary studies for a comprehensive investigation. The evaluation of the results showed that the largest number of papers was published in 2018 (11), whereas only one paper was published in 2015. However, based on the trends of 2017, 2018, and 2019, the growth in publication count in this field appears to be stable. Springer has the highest publication share (46%), and MDPI has only a 6% share. The International Journal of Advanced Manufacturing Technology is the well-reputed journal with the most publications (13), whereas various journals were identified with only one publication each. The articles are distributed primarily on the basis of research type, consisting of validation (20) and comparative study (26) articles. A thorough review was conducted on each study to achieve a meticulous and systematic inspection, resulting in the identification of the various aspects presented. It was found that 35 studies proposed algorithms, 20 studies established case studies, ten studies performed analysis, 11 studies proposed a model, and 20 studies are based on FMTAS. Furthermore, all 46 selected primary studies have been categorized on the basis of the objective function. It was identified that 20 studies are multi-objective, whereas most studies are single-objective.
Considering the QoS parameters of the single-objective studies, 15 studies included time, whereas 14 studies evaluated the quality parameter. Eight manuscripts evaluated trust, and 18 included usability, whereas 19 studies considered reliability, and 20 studies evaluated cost. As far as the multi-objective studies are concerned, 20 studies considered time, three studies considered quality, trust was evaluated by 11 studies, usability by 18, reliability by 16, and cost by 19 studies. However, 95% of the studies in both categories achieved positive results for the cost parameter. Each study was assigned a score on the basis of the quality assessment parameters, which contributed towards the annual mean and standard deviation, resulting in an average score of approximately 2.66 out of 4. Heuristic-based approaches have been adopted by 76% of the studies. The methodology was comprehensively elaborated in 98% of the studies. The explicit function has been explained in 89% of the studies. Similarly, the fitness function is distributed between maximization (46%) and minimization (20%), whereas the remaining studies failed to describe the fitness function comprehensibly. A comparison of the proposed approach with existing studies has been put forward by 57% of the studies, whereas 54% of the studies have not included case studies, and 89% do not include a penalty function. Furthermore, the maximization and minimization of the fitness function and the comparison of the proposed approach with existing approaches, along with case studies and penalty function details, have been elaborated. Owing to the lack of a standard research dataset, most of the data in the 83% of studies that used a dataset for experimentation was randomly generated, whereas only 2% was collected real data. Out of the 46 selected primary studies, 22 studies have elaborated case studies. MATLAB is the simulation tool of choice for the majority of the researchers, whereas Python, Java, and GAMS are the least used.
In this study, the authors have comprehensively reviewed the state-of-the-art studies and elaborated the benefits and drawbacks of the methodologies they adopted, and have also explored the QoS parameters, fitness functions, algorithms, datasets, simulation tools, and various other correlations between numerous studies to lay a foundation and a roadmap for future research in this field. The authors believe that the results achieved in this study will help future researchers to embark on new paradigms in this field of research.

DESHUN LIU received the Ph.D. degree. He has served in various positions such as the President and the Vice-Chairman of the Hunan University of Science and Technology and an Academic Visitor with the University of Missouri-Rolla. He is currently a Professor and a Postdoctoral Supervisor. He is also the Chairman of the Hunan University of Science and Technology. He has undertaken more than 20 projects at the national and provincial levels. He has published more than 100 papers in key journals and international conferences, including ASME, Journal of Mechanical Engineering, and Mechanism and Machine Theory. He has authored five monographs. His research interests include mining machinery and mechanical system dynamics. He was a winner of the State Council's Special Allowance. He is also a member of the Editorial Board of leading journals, such as the Journal of China Coal Society and Journal of Mechanical Engineering.
DONG ZHOU received the Ph.D. degree from the University of Nottingham, U.K., in 2009. He has worked as a Research Fellow with the Centre for Next Generation Localization, Trinity College Dublin, Ireland, from 2008 to 2012. He is currently a Professor with the School of Computer Science and Engineering, Hunan University of Science and Technology, China. His current research interests include information retrieval, natural language processing, machine learning, and data mining.
GUOJUN WANG (Member, IEEE) received the B.Sc. degree in geophysics and the M.Sc. and Ph.D. degrees in computer science from Central South University, China, in 1992, 1996, and 2002, respectively. He was a Professor with Central South University, an Adjunct Professor with Temple University, USA, a Visiting Scholar with Florida Atlantic University, USA, a Visiting Researcher with The University of Aizu, Japan, and a Research Fellow with The Hong Kong Polytechnic University, Hong Kong. He is currently a Pearl River Scholarship Distinguished Professor of higher education, Guangdong. He is also a Ph.D. Supervisor with the School of Computer Science and Cyber Engineering, Guangzhou University, China. His research interests include artificial intelligence, big data, cloud computing, mobile computing, trustworthy/dependable computing, cyberspace security, recommendation systems, and mobile healthcare systems. VOLUME 8, 2020