Model-Driven Engineering Tools and Languages for Cyber-Physical Systems–A Systematic Literature Review

The development of Cyber-physical Systems (CPS) is drawing increasing interest from both researchers and industrial practitioners, considering the opportunities they offer in almost all areas of industry. However, the engineering and management of CPS are challenging tasks due to their inherent heterogeneity and complexity. Regarding the development of CPS, there currently exists no standard methodology owing to the complexity of the domain. One of the key approaches to reduce the development complexity for CPS is Model-driven Engineering (MDE), which is frequently used in many industrial domains for software development to increase the level of platform abstraction. Nevertheless, it is almost always challenging, especially for new researchers in this field, to determine the appropriate tools and languages to perform a particular MDE activity during CPS development. To the best of our knowledge, there is no guideline that demonstrates which language(s)/tool(s) to use for the various MDE techniques/phases for the development of CPS. This paper presents a Systematic Literature Review (SLR) study that focuses on identifying and classifying the recent research practices pertaining to CPS development by applying MDE approaches. With the objective of providing a general overview of the field, the study evaluates 140 research papers published during 2010–2018. Accordingly, a precise view of the various MDE tools and languages used in the development life-cycle of CPS, the addressed MDE techniques/activities, and the targeted CPS components is presented. We believe that the conducted study will guide researchers and practitioners to identify appropriate tools and languages according to the system requirements. It may also help in getting an overall understanding of the research trends for further research and development on the MDE of CPS.


I. INTRODUCTION
Developers of Cyber-physical systems (CPS) face significant challenges due to the heterogeneous nature of these systems, such as the need for knowledge and skills from multiple academic disciplines, the integration of the artifacts of these disciplines, and the difficulty of maintaining such heterogeneous artifacts. In order to address these challenges and reduce the complexity of CPS development, one of the key approaches is Model-driven Engineering (MDE), which is frequently used in many industrial domains for software development [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Claudia Raibulet.
The abundance of different hardware platforms available for CPS makes their development complex [2]-[4]. There is a need for a methodology that efficiently raises the abstraction level to overcome the issues of heterogeneity induced by the multidisciplinary nature of such systems. Towards this goal, many researchers believe that MDE is a better alternative solution to overcome challenges such as development complexity, heterogeneity, adaptability, and reuse, and they propose various applications of MDE for CPS development (e.g. [5]-[9]). However, to the best of our knowledge, no secondary study highlighting both previous and current research efforts related to applying MDE approaches for CPS has been performed yet. Such an overview would be helpful to both researchers and industrial practitioners for discovering the pros and cons of applying MDE in CPS and for identifying interesting research directions. Without such a secondary study, it may be cumbersome to determine what was proposed, what has been successfully completed and what has failed.
In this paper, we present a Systematic Literature Review (SLR) of the studies which used MDE techniques such as domain-specific modeling, metamodeling, validation & verification, model transformation, simulation, code generation and analysis for CPS. An evaluation within the context of a set of research questions and analysis of the proposed toolchains in the primary studies is performed within the scope of this study. As a result, numerous tools and languages have been identified for modeling, model transformation, simulation, code generation, analysis, verification and validation activities for CPS.
The aim of this study is not to compare any of the languages/tools developed by/used in the primary studies for the MDE of CPS since such comparisons are beyond the scope of the SLR studies. Our aim is rather to report on the languages/tools developed/used in the primary studies and present them to the readers in a systematic manner, e.g. by identifying meaningful correlations between the results achieved from the evaluation of the introduced research questions which is the objective of the SLR studies [10].
The results of this study may help researchers and practitioners to easily reach the desired class of studies and related publications considering the tools and languages, technologies, and best practices used. This study also helps researchers avoid unnecessary duplication of the trials and errors which they may encounter during the MDE of CPS. Finally, it identifies the languages and tools used in each MDE technique/phase as well as the addressed CPS components.
The remainder of the paper is organized as follows: A brief introduction to CPS and MDE is given in Section 2. Section 3 discusses the related work. Section 4 describes the research methodology and protocol definition to carry out the SLR. Section 5 shows the achieved results. The discussion of the results and the conclusions are presented in Sections 6 and 7, respectively.

II. BACKGROUND
In the following subsections, CPS and MDE, including their structures and construction phases, are briefly discussed before giving the details of the conducted SLR.
A. CYBER-PHYSICAL SYSTEMS
CPSs are real-time, distributed feedback systems, interconnecting embedded systems, networking systems, physical systems and human users [15], where a centralized system monitors and controls the physical process(es), mostly with feedback loops [16]. Examples of CPS include smart factories, Industry 4.0/Cyber-physical production systems (CPPS), Unmanned Aerial Vehicles, safety-critical systems, and smart cities.
In contrast to conventional embedded systems, a comprehensive CPS is usually conceived as a network of interacting elements with physical input and output rather than as standalone devices. While CPSs tend to emphasize the interconnection between the cyber/computational and physical elements, embedded systems tend to focus on the computational elements and are often referred to as "closed boxes" because they do not expose their computational capabilities to the outside [16]. A CPS may consist of a group of interconnected embedded systems, sometimes functioning as the main brain of the whole system, with sensors and actuators. Therefore, one can deduce that embedded systems form a subset of CPS.

B. MODEL-DRIVEN ENGINEERING
MDE advocates the use of models as the backbone of the software development process, with the aim of automating the process from the design stage to the implementation stage. It tends to raise the abstraction level from computing concepts to an abstraction closer to the understanding of both the developers and the other stakeholders [17]. Models are used in MDE as the basic building blocks for the development of software artifacts. The MDE paradigm raises the abstraction level of software/system development from low-level artifacts to higher-level models and bridges the gap between the problem identification and software implementation phases. This can be done by fully or partially generating either software implementations (e.g. in C++, Java, and C#) or deployment artifacts (such as XML-based configuration) from models that describe the system at multiple levels of abstraction and from a variety of perspectives [1], [18], [19].
Like any other paradigm, MDE has pros and cons [19]. Among its pros, it helps to reduce the time/effort required for system/software development. As the code is partially generated automatically, the task of the developers is to complete the parts that were not intended to be generated automatically or that could not be generated by the tool used. Moreover, MDE helps increase the quality, productivity and maintainability of the final artifacts. With regard to the cons, one of the major drawbacks of MDE could be that more time has to be spent on system analysis in order to correctly model the system and therefore generate correct and optimized code. Researchers and practitioners may also face various MDE challenges, e.g. its sometimes uncertain and bidirectional foundation, the complexity of the application domains and the lack of good tooling [20].
Taking a glimpse at the literature, it is observable that MDE is used for various software development lifecycle (SDLC) activities [19], which is to say, it is possible to support different SDLC activities using MDE:
• System design: Designing is the main activity of MDE, in which a fully functional DSL and its tool are used. This may consist of more than one MDE technique, such as modeling, transformation and code generation.
• Modeling: Presentation of a graphical or textual representation of the system that may be used for tasks such as modeling system features, functionality or behavior, and modeling time constraints or system threats, etc.
• Transformations: Transformations are regarded as an integral part of MDE as they help developers define mappings between the models.
• Code generation: The aim here is to generate architectural code out of the designed models. The code is generated from the designed models by means of a model-to-text transformation or a rule-based template engine.
• Validation and Verification (V&V): This could be seen as testing or analyzing MDE techniques, such as the verification of the designed models, the analysis of the accuracy of the transformations and their rules, and also the validation of the final artifacts. Ensuring that the right system is built is called validation. One important method for validation is simulation of system behavior. However, assuring that the system is behaving according to the specifications is called verification. There exist various methods of verifying a system in MDE such as requirement analysis, performance analysis, failure analysis, formal methods and model checking [21].
• Simulation: Simulation can be done using general-purpose languages (such as Python, C++, Java, etc.) or simulation-specific programming languages like SIMULA, both of which use a compiler or interpreter to run the simulation. However, utilizing MDE tools and techniques like modeling and model transformation will make performing simulations cost-effective and effort-saving [22].
• Requirement analysis: Requirement engineering/analysis (RE) combines tasks such as defining, modeling, validating, and reaching agreement on the system requirements. As suggested by [23], MDE can be used as a technical solution for requirement analysis, as it is possible with MDE to integrate the tasks of RE.
MDE can also be further used for system analysis activities e.g. safety analysis, security analysis, operational analysis and dependability analysis.
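The code generation activity described in the list above (a model-to-text transformation driven by a template) can be illustrated with a minimal sketch. The `Sensor` model element, its fields, and the shape of the generated C++ skeleton are assumptions made purely for illustration; they are not taken from any tool in the reviewed studies.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class Sensor:
    """A toy model element (hypothetical; fields chosen for illustration)."""
    name: str
    unit: str
    period_ms: int


# Text template for the generated C++ class skeleton.
CLASS_TEMPLATE = Template(
    "class ${cls} {\n"
    "public:\n"
    "    static constexpr int kPeriodMs = ${period};\n"
    "    double read();  // returns a value in ${unit}\n"
    "};\n"
)


def generate(sensor: Sensor) -> str:
    """Model-to-text step: map one model element to source text."""
    # "wheel_speed" -> "WheelSpeedSensor"
    cls = sensor.name.title().replace("_", "") + "Sensor"
    return CLASS_TEMPLATE.substitute(
        cls=cls, period=sensor.period_ms, unit=sensor.unit
    )


code = generate(Sensor(name="wheel_speed", unit="rad/s", period_ms=10))
print(code)
```

The generated skeleton still leaves `read()` unimplemented, which mirrors the point made earlier that developers complete the parts the generator does not produce.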

III. RELATED WORK
Since the scope of this study is to present an SLR on the state-of-the-art of MDE tools and languages for CPS, the related secondary studies (surveys, systematic mapping (SM) studies, SLRs, and tertiary studies) are addressed in this section as the related work. Although, to the best of our knowledge, there is no secondary study exactly on this topic, the following studies are relevant to the conducted SLR. Table 1 presents a summary of this related work.
VOLUME 9, 2021
In [24], an SLR on multi-paradigm modeling for cyber-physical systems is presented, where the objective of the study is to investigate studies promoting multi-modeling, multi-view, and multi-formalism approaches for the development of CPS. 265 research papers published during the period 2006-2017 are evaluated. The study reports the approaches and tools used for multi-paradigm modeling and indicates the type of formalism presented and which tool and/or language is used for implementing it. Their focus is mainly on the domain distribution and generality of the approaches. Furthermore, they report on the actors and stakeholders involved in the modeling process and their background knowledge. On the other hand, the objective of our study is mainly to investigate the reported MDE approaches and tools used for the development of CPS. That is to say, when discussing modeling approaches in our study, we focus on the presented MDE activities/techniques (i.e. metamodeling, model transformation, code generation, analysis, etc.) and the languages/tools used for each of these MDE activities/techniques, as well as presenting the addressed CPS components, which are not taken into account in [24].
The authors of [25] conducted an SLR on the development of embedded systems using the model-based system engineering (MBSE) approach. The study reviewed 61 research papers published in one of four renowned scientific databases (IEEE, SPRINGER, ELSEVIER, and ACM) during the years 2008-2014. Subsequently, primary studies are grouped into six categories according to their relevance to the corresponding MBSE activity, namely the general, modeling, model transformation, model verification, simulation, and property specification categories. As a result, the study presents 28 tools which support modeling, model transformation, validation, and verification activities. The study examined the utilization of UML and SysML/MARTE profiles, and it also analyzed the application of both model-to-model and model-to-text transformations.
The SLR study in [33] covered 52 studies published during 2008-2015, with the objective of investigating the research trends in the representation of constraints and the use or customization of UML together with its SysML/MARTE profiles for requirement specification of embedded systems. The study thus looked at the practices of UML profiles/diagrams for design specifications, and the trends to specify properties/constraints for design verification. As a result, the study found that the combination of UML and MARTE profiles is the major trend, and that the class/block diagram is commonly used to specify structural aspects, whereas the state machine/activity diagram is commonly used to specify behavioral aspects. While the main focus of the work in [33] is to investigate how UML and its profiles SysML/MARTE were utilized for requirement specification of embedded systems, in our work, we aim at reporting the various MDE tools and languages used in the development life-cycle of CPS, the addressed MDE techniques/activities, and the targeted CPS components.
Another SLR is presented in [26], where the authors investigate studies combining Product Line Engineering (PLE) and MDE for the development of safety-critical embedded systems. This study further examined whether there are empirical studies applying the aforementioned techniques in the development process of safety-critical embedded systems. The study shows that, in recent years, the use of MDE combined with PLE techniques to build safety-critical embedded systems has been gradually growing. The study also states that the proposed approaches in the primary studies are not compared with any other related studies and that these approaches do not explicitly differentiate between software and hardware variabilities.
The study in [34] concentrated mainly on the security and safety aspects of CPS, where the authors conducted an SLR study to identify the existing model-driven approaches for secure CPS that specifically cover both software and platform description. As a result, the analysis identified 17 model-driven security approaches, 7 of which were specific to CPS and were investigated more closely.
An SM study is presented in [27]. This study investigates the implementations of MDE in the field of mobile robot systems (MRS). In this study, 69 research papers were selected, and as a result, the authors found out that many domain-specific modeling languages (DSMLs) are supported with tools which are mostly built ad-hoc. Also, they reported that the solutions based on UML and using Eclipse-based tools were less preferred in this field.
A survey, presented in [28], collected quantitative data from 113 subjects to provide the current state-of-practice (SoP) and the challenges faced by the domain of embedded systems due to weaknesses in model-based engineering (MBE). The survey has two research questions. The first question is related to capturing the state of MBE practice in the embedded systems domain, how many activities concern MBE compared to non-MBE, and understanding the pros and cons of adopting and deploying MBE. The second question is about estimating whether there are important variations in the SoP between different groups in the embedded systems domain. As a result, the study provides information about the methods and tools used, the purposes of models, the effects of using MBE, and its weaknesses. Furthermore, answers to the survey showed that most of the participants believe the positive outcomes of MBE distinctly exceeded the negative outcomes. Nonetheless, survey participants mentioned weaknesses such as the interoperability challenges amongst existing tools and the high effort needed to train developers for MDE.
Another survey is presented in [29]. The study introduces statistical findings about the use of UML modeling and model-driven approaches for the design of embedded software in Brazil. The goal of this study is to identify gaps in the knowledge of how exactly UML and MDE or Model-driven Architecture (MDA) are used in industry, and to provide an understanding of how social and organizational factors impact the use of UML and MDE/MDA. Model-driven techniques and tools for CPS were surveyed in [35] including the provided features and solutions to the challenges. However, the given analysis is limited to evaluating only 10 specific representative techniques and tools.
The survey in [30] focused on summarizing and classifying the many state-of-the-art co-simulation approaches. The survey covers the key challenges in enabling co-simulation. Initially, two major co-simulation domains were addressed separately, namely continuous-time and discrete-event-based co-simulation. Then, the challenges which occur when the two domains are merged were discussed. The proposed taxonomy is used for the classification of the works related to co-simulation. 84 papers published during the period 2011-2016 were read and classified.
The work in [31] carried out a survey on the industrial Internet of Things (I-IoT), from a CPS viewpoint, concentrating mainly on control, networking, and computing systems perspectives. The contribution of this research revolves primarily around three points: Firstly, presenting I-IoT architecture and identifying I-IoT performance requirements from the communications point of view (e.g. latency and reliability). Secondly, conducting an I-IoT survey from the three critical perspectives referred to above. Finally, presenting the challenges and future research of control, networking and computing systems in the I-IoT. Similarly, in [32], a survey in CPS is carried out, in order to summarize CPS features. The study gives the research progress from different perspectives like energy control, security control, transmission and management, control technique, system resource allocation, and model-based software design. Subsequently, the study incorporates three classic applications so as to prove that the prospects of CPSs are alluring. Ultimately, the study demonstrates research challenges and outlines some suggestions for future work.
Furthermore, [36] discussed CPS challenges. The authors discussed the maturation of embedded systems into CPS and the design and operating challenges that resulted from it. The paper argues that the maturation of embedded systems has been growing by leaps and bounds as networking technology became available. The research focused primarily on the functional aspects of CPS. The authors linked general software modeling challenges to the specific challenges of CPSs. The study posed many challenges related to maintainability, dependability, security, resilience, certification and policy. Similarly, we addressed the challenges of CPS in an accompanying study [37], where our objective was to report the approaches followed when applying MDE for CPS, the application domains, the presented case studies and the addressed CPS challenges by conducting a systematic mapping study. Results related to the challenges addressed by the examined studies have shown that numerous CPS challenges were addressed, namely, complexity, dependability, flexibility, interoperability, latency, predictability, reliability, remote monitoring, security and sustainability. Complexity and interoperability seem to be the most addressed.
Unlike the work presented in [25], [26], [35] for embedded systems only, our study focuses on conducting an SLR on the primary studies concerning the development of CPS using the MDE paradigm/approach. Our SLR also differs from the above-mentioned studies [30]-[32], [34], especially within the context of the application domain. As briefly discussed above, the work in [30] focused on co-simulation approaches regardless of the specific domains of the system, i.e. it provided a remarkable classification of the studies using continuous-time and discrete-event-based co-simulations; however, application domains such as CPS, IoT, embedded systems and Industry 4.0 are not in the scope of that work. Finally, the surveys in [31], [32] addressed the I-IoT and CPS domains, respectively. However, they did not consider any development paradigm/approach applied for the construction of these systems, while we present an SLR of the tools/techniques used during the MDE of CPS.

IV. METHODOLOGY
SLR is a widely adopted approach when conducting a comprehensive review relevant to specific research questions. Several studies present guidelines [10], [38], [39] and approaches [40] to be followed for SLR studies. In this section, the applied methodology for the conducted SLR is discussed. Initially, the process followed during this work is described. Then, the research questions are defined and the primary study selection strategy is discussed. This is followed by specifying the inclusion and exclusion criteria in addition to the quality assessment and self-assessment criteria, and finally determining the data extraction procedure.

A. PROCESS
We developed the procedure of the systematic review in this study by following the guidelines defined in [10]. Figure 2 shows an overview of the followed process. The SLR comprises three main phases: review planning, review execution, and review reporting [38]. The protocol and research questions of the SLR are prepared in the planning phase, while collecting the studies, extracting data from them and analyzing these data are performed in the execution phase. Finally, the achieved results are reported. In every "Evaluation" process shown in Figure 2, all researchers participating in the SLR decide whether or not the previous SLR process has been successfully completed. If yes, which is denoted as "Passed" in the diagram, the researchers move to the next process; otherwise, which is denoted as "Failed" in the diagram, the previous process is revisited until all reviewers agree.

B. RESEARCH QUESTIONS
Research questions were identified by following the PICOC criteria [10] (see Table 2). The research questions of this study are determined as below:
• RQ1: Which activity(s) of the system development is/are addressed in the study (using MDE)? Motivation: The motivation behind the inclusion of this research question is to report on the different activities for which MDE techniques have been applied in primary studies. The results for this question are expected to retrieve the information on the extent to which the investigated paper covered MDE activities/techniques, e.g. modeling, model transformation, simulation, code generation, analysis, verification and validation. As a result, the outcome of this question will help readers to identify which studies applied which MDE activities/techniques.
• RQ2: Is/Are there any tool and/or DSL developed for MDE of CPS by the study? Motivation: The objective of this question is to find out the tool(s) developed by the investigated studies, their availability, and the languages used while developing these tools. The results of this question will help readers identify recently developed tools as well as the ongoing work in this field. Moreover, the results of this question are to be further correlated with the RQ1 results. The results of this correlation will help readers identify where the current research is most concentrated and easily reach the desired set of studies. Results of this correlation are shown in Table 13.

C. SEARCH AND SELECTION STRATEGY
This stage can be considered as one of the most important and critical stages when conducting a secondary study (i.e. in this case, an SLR). Therefore, it should be carefully defined, since the search of primary studies should ensure comprehensive coverage of the topic under consideration. For a search strategy to be optimal, it needs to simultaneously include as many relevant primary studies as possible (recall) and exclude irrelevant ones (precision). One can deduce that an optimal search strategy must have 100% recall and/or 100% precision. Nevertheless, it is unlikely that a search strategy gives 100% in both/either recall and/or precision. Accordingly, one should come up with a satisfactory trade-off search strategy (i.e. good enough) that results in not many relevant studies missed and a manageable quantity of irrelevant studies included [41]. The search strategy developed in this study comprises four stages. Firstly, an automatic search over the most relevant scientific digital libraries was performed. Secondly, all duplicate papers were removed. Thirdly, following predetermined criteria of inclusion, only papers related to the topic were considered. Finally, further studies were searched by forward snowballing [42] (see Section IV-C4 for the detailed description of the forward snowballing). The composition of the search and selection strategy followed in this work is shown in Figure 3.
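As a small illustration of the recall/precision trade-off discussed above, the following sketch computes both measures for a search result, using the standard definitions (recall is the share of relevant studies that were retrieved; precision is the share of retrieved studies that are relevant). The paper identifiers are invented, and the value returned for an empty set is a convention chosen here, not something prescribed by the guidelines.

```python
def recall(retrieved: set, relevant: set) -> float:
    """Fraction of the relevant studies that the search retrieved."""
    if not relevant:
        return 1.0  # convention chosen for this sketch
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: set, relevant: set) -> float:
    """Fraction of the retrieved studies that are relevant."""
    if not retrieved:
        return 1.0  # convention chosen for this sketch
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical data: four relevant papers exist; the search returns six
# results, three of which are relevant.
relevant = {"p1", "p2", "p3", "p4"}
retrieved = {"p1", "p2", "p3", "x1", "x2", "x3"}

r = recall(retrieved, relevant)      # 3 of 4 relevant papers found
p = precision(retrieved, relevant)   # 3 of 6 results are relevant
print(f"recall={r:.2f}, precision={p:.2f}")
```

Broadening the search string typically raises recall at the cost of precision, which is the trade-off the "good enough" strategy has to balance.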
It is worth indicating that, due to the ambiguity between terms such as CPS, Internet of Things (IoT), embedded systems and self-adapting systems, considering all these terms and their variations in the search keywords of an SLR would result in an immense number of studies, ultimately undermining the credibility, validity, and reproducibility of this study and leaving its scope undefined. It was therefore decided to consider only those studies that identify themselves as CPS-related studies in this SLR. This means that if a primary study contains any of the keywords (i.e. "cyber-physical system*" OR "cyber physical system*" OR "smart system*" OR "cyberphysical systems" OR "cps") in its title, abstract, introduction or conclusion, it is included in our initial pool of studies.
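The keyword rule above can be sketched as a simple text filter. This is only an illustration of the rule, not the search mechanism of any digital library: the trailing `*` wildcard is interpreted here as "any run of word characters", matching is case-insensitive, and the dictionary layout of a paper (`title`, `abstract`, `introduction`, `conclusion`) is an assumption.

```python
import re

# Keywords as listed in the text; "*" is a trailing wildcard.
TERMS = [
    "cyber-physical system*",
    "cyber physical system*",
    "smart system*",
    "cyberphysical systems",
    "cps",
]


def term_to_regex(term: str) -> str:
    """Escape a keyword and turn its '*' wildcard into a word-suffix match."""
    return re.escape(term).replace(r"\*", r"\w*")


CPS_PATTERN = re.compile(
    r"\b(?:" + "|".join(term_to_regex(t) for t in TERMS) + r")\b",
    re.IGNORECASE,
)

# Sections inspected by the rule (field names are hypothetical).
SECTIONS = ("title", "abstract", "introduction", "conclusion")


def is_cps_related(paper: dict) -> bool:
    """True if any inspected section contains one of the keywords."""
    return any(CPS_PATTERN.search(paper.get(s, "")) for s in SECTIONS)


print(is_cps_related({"title": "A DSL for Cyber-Physical Systems"}))
print(is_cps_related({"title": "Model checking of embedded software"}))
```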

1) PERFORMING AUTOMATIC SEARCH
One of the key issues with SLR studies is getting all the relevant studies on the examined topic [43]. To get as many related primary studies as possible, an automatic search was performed on the following digital libraries: ACM, IEEE Xplore, ScienceDirect, DBLP, Scopus and Web of Science. The initial pool included all indexed studies found in the aforementioned digital libraries, using predefined search keywords.
Initially, the digital libraries were searched manually. However, a massive number of studies (on average, above 5000 results from each search engine) were returned. Therefore, it was decided to perform an automatic search, as advised for conducting an SLR in [10]. The PICOC criteria [10] were used to define the keywords, shown in Table 2, which lead to "good enough" search strings. Table 3 shows the searched digital libraries and the corresponding search string(s) used. After completing the automatic search, 646 studies were obtained.
Many challenges were encountered while using the digital libraries. One of the main challenges is the lack of guidelines explaining how to use the advanced search features of these digital libraries. Also, the number of terms allowed in a search string is limited in digital libraries like ScienceDirect and DBLP, which forces the search string to be split into multiple search strings. In addition, wildcards are not supported in ScienceDirect. Another challenge is that digital libraries like ACM and IEEE do not provide the capability to restrict the search to more than one specific area at once, e.g. title, abstract, and keywords combined.

2) REMOVING DUPLICATE STUDIES
Initially, the pool of primary studies was kept in the Mendeley reference manager, which was used to facilitate the process of determining duplicate studies. Duplicate checking continued through the later stages (i.e., forward snowballing). In total, 113 duplicate papers were eliminated, so 533 studies remained. Two papers are considered duplicates if:
• their title, author(s), publication date and venue are the same. In case of different versions of the same paper, the most recent is kept.
• the same paper is published in different venues. In this case, one of them is selected (the most recent).
• the same study has both journal and conference publications. In this case, the journal publication is kept, as it contains the extended study and provides more information.
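The duplicate rules above can be sketched as follows. This is not how Mendeley detects duplicates; it is a simplified illustration that assumes two records describe the same study when their normalized title and author list match (in practice an extended journal version may carry a different title and would need manual matching). The `Record` type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A bibliographic record (hypothetical structure for this sketch)."""
    title: str
    authors: tuple
    year: int
    venue: str
    kind: str  # "journal", "conference" or "workshop"


def study_key(r: Record) -> tuple:
    """Assumed identity of a study: normalized title plus author names."""
    title = " ".join(r.title.lower().split())
    return (title, tuple(a.lower() for a in r.authors))


def preferred(a: Record, b: Record) -> Record:
    """Journal version wins; otherwise the most recent record is kept."""
    rank_a = (a.kind == "journal", a.year)
    rank_b = (b.kind == "journal", b.year)
    return a if rank_a >= rank_b else b


def deduplicate(records: list) -> list:
    kept = {}
    for r in records:
        k = study_key(r)
        kept[k] = preferred(kept[k], r) if k in kept else r
    return list(kept.values())


pool = [
    Record("MDE for CPS", ("A. Author",), 2016, "Conf X", "conference"),
    Record("MDE  for  CPS", ("A. Author",), 2015, "Journal Y", "journal"),
    Record("Another Study", ("B. Writer",), 2017, "Conf Z", "conference"),
]
unique = deduplicate(pool)
print(len(pool) - len(unique), "duplicate(s) removed")
```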

3) SELECTING PRIMARY STUDIES
In this stage, primary studies are selected following predefined inclusion and exclusion criteria (see Section IV-D).
Only those studies matching the criteria are included in the final pool of the research. The criteria were applied by reading the Title, Abstract, Keywords, and Introduction sections; however, if this was not enough for reaching a decision, other parts like Methodology and Conclusion were considered. The process of selecting primary studies is shown in Figure 4. The inclusion or exclusion of studies was performed in several iterations:
• Iteration 1: The primary reviewer went through each study, reading its title and abstract and checking the general content (figures, models, tables, etc.). Studies which met the inclusion and exclusion criteria passed to the next iteration (278 studies were removed in this iteration).
• Iteration 2: All the studies which passed iteration 1 were read in more detail by further reading the related paper's introduction and conclusion sections and, if necessary, other sections (e.g. methodology and case study). This iteration resulted in including 88 papers and excluding 82 papers. 85 papers were left undecided ("to be reviewed").
• Iteration 3: The 85 undecided papers in iteration 2 were again reviewed with a secondary reviewer. In this stage, both reviewers agreed on either including or excluding the paper. As a result, 34 papers were later included, whereas 51 papers were later excluded. To sum up, 88 papers were included from iteration 2 and 34 papers were included from iteration 3, forming a pool of 122 primary studies.
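The counts reported for the search, duplicate removal and the three selection iterations can be cross-checked with a few lines of arithmetic. All numbers below come from the text, except the 255 studies entering iteration 2, which is derived here rather than stated.

```python
# Automatic search and duplicate removal
automatic_search = 646
duplicates = 113
after_dedup = automatic_search - duplicates          # 533 studies remained

# Iteration 1: title/abstract screening by the primary reviewer
removed_it1 = 278
entering_it2 = after_dedup - removed_it1             # derived: 255

# Iteration 2: closer reading
included_it2, excluded_it2, undecided_it2 = 88, 82, 85
assert included_it2 + excluded_it2 + undecided_it2 == entering_it2

# Iteration 3: undecided papers reviewed with a secondary reviewer
included_it3, excluded_it3 = 34, 51
assert included_it3 + excluded_it3 == undecided_it2

primary_pool = included_it2 + included_it3
print(primary_pool)  # pool size before forward snowballing
```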

4) FORWARD SNOWBALLING
To assure that no potential primary studies get ignored, studies that might not have been reached on the basis of automatic searching were also searched. It is critical to obtain a good  sample of primary studies [44], [45] and various approaches including snowballing [42], quasi-gold standard [46], random sampling and margin of error [47] exist to facilitate the identification of the related primary studies. It is also possible to combine these approaches. Conforming to the snowballing guidelines given in [42], the forward snowballing process was VOLUME 9, 2021 accomplished in this study by determining other papers citing any of the primary studies. We used Google Scholar to find those studies.
Forward snowballing was conducted during the study selection phase in two iterations. In the first iteration, we obtained 15 studies after applying the inclusion and exclusion criteria and removing duplicates. The second iteration was then performed on the studies obtained at the end of the first iteration. Applying the same process as in the first iteration, the second iteration produced 3 new studies. This resulted in the inclusion of 18 papers in the pool of primary studies, raising the total to 140 papers.
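The selection counts reported above can be cross-checked with a short tally. The following is a minimal sketch that only restates the figures given in the text; the variable names are illustrative and are not part of the review protocol.

```python
# Counts reported for the study selection and forward snowballing steps.
iteration2 = {"included": 88, "excluded": 82, "undecided": 85}
iteration3 = {"included": 34, "excluded": 51}
snowballing = {"iteration_1": 15, "iteration_2": 3}

# Every paper left undecided in iteration 2 is resolved in iteration 3.
assert iteration3["included"] + iteration3["excluded"] == iteration2["undecided"]

# Pool after the three selection iterations, then after forward snowballing.
pool_after_selection = iteration2["included"] + iteration3["included"]
pool_after_snowballing = pool_after_selection + sum(snowballing.values())

print(pool_after_selection)    # 122
print(pool_after_snowballing)  # 140
```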

D. INCLUSION & EXCLUSION CRITERIA (SELECTION CRITERIA)
Once all potentially relevant papers are gathered, their relevance must be assessed. Selection criteria are intended to identify the papers (primary studies) that are directly related to the research questions, as suggested in [10]. The inclusion and exclusion criteria must therefore be based on the research questions. These criteria are applied when selecting the primary studies and when performing forward snowballing. To reduce the likelihood of bias, the criteria should be documented in the protocol definition stage, although they might be revised during the search process. Inclusion and exclusion criteria are applied to a paper/study by reading sections such as the title, abstract, introduction, and conclusion.
• A paper is included in the primary studies pool only if it meets all the inclusion criteria and none of the exclusion criteria.

1) INCLUSION CRITERIA
• IC1: Study must propose at least one of the MDE approaches or techniques for CPS.
• IC2: Study must target CPS or at least one of its application domains.
• IC3: Study must be a peer-reviewed journal, workshop, or conference paper.
• IC4: Models presented by the study must not be used only for documentation and design purposes.
• IC5: Paper publication period must be between 2010 and 2018.
• IC6: Study must be available in full-text in any of the predefined digital libraries.

2) EXCLUSION CRITERIA
• EC2: Study is irrelevant to CPS or any of its application domains and the field of software engineering.
• EC3: The study is a summarized version of a complete work already in the SLR pool.
• EC4: Study is a kind of educational, editorial, tutorial, or other material (i.e., not a scientific paper).
• EC5: Study is written in a language other than English.
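The selection rule stated above (all inclusion criteria met, no exclusion criterion met) can be expressed as a simple predicate. The sketch below is illustrative only: the function name and the example assessment are hypothetical, and the criterion labels are used purely as dictionary keys.

```python
from typing import Mapping


def is_primary_study(inclusion: Mapping[str, bool], exclusion: Mapping[str, bool]) -> bool:
    """Return True only if every inclusion criterion holds and no exclusion criterion applies."""
    return all(inclusion.values()) and not any(exclusion.values())


# Hypothetical assessment of one candidate paper (values are made up).
inclusion = {"IC1": True, "IC2": True, "IC3": True, "IC4": True, "IC5": True, "IC6": True}
exclusion = {"EC2": False, "EC3": False, "EC4": False, "EC5": False}

print(is_primary_study(inclusion, exclusion))                   # True
print(is_primary_study(inclusion, {**exclusion, "EC5": True}))  # False
```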

E. DATA EXTRACTION
Initially, the final pool of primary studies is stored in Mendeley. Next, a Google sheet is used for the data extraction stage. The final version of the extracted data is available on IEEE DataPort. In the data extraction sheet, research questions are represented in columns, whereas primary studies are represented in rows. The data extraction process in this study goes through 3 phases. The data extraction form is shown in Table 4.
• Phase 1: The primary reviewer starts extracting data from the primary studies (answering research questions). Extracted data for each study is represented in a row where each row has a key that refers to the study in Mendeley. Data extraction of each paper is followed by answering quality and self-assessment questions.
• Phase 2: The secondary reviewer starts reviewing primary studies with a self-assessment score below 50%. After evaluating the study, if the secondary reviewer agrees with the answers given by the primary reviewer, the study is marked as agreed on; otherwise, it goes through phase 3.
• Phase 3: In this phase, the primary and secondary reviewers discuss the paper disagreed upon in an effort to reach common ground.
More details on the followed methodology and the analysis of the results can be found in our technical report. It is worth indicating that the technical report investigates all the studies addressing MDE for CPS from a broader perspective, whereas the present paper focuses specifically on tool and language selection in MDE for CPS. To this end, the research questions introduced in this paper aim to obtain findings on the MDE activities covered in the studies (RQ1), the tools and languages used in MDE for CPS (RQ2), the tools and languages developed to apply MDE for CPS (RQ3), and the CPS components addressed using these tools and languages (RQ4).
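The three-phase extraction workflow described above can be sketched as a small decision function. This is an illustrative sketch only: the class, the function, and the status strings are hypothetical names, and the handling of studies scoring 50% or more (no secondary review) is an assumption inferred from the description of Phase 2.

```python
from dataclasses import dataclass
from typing import Optional

REVIEW_THRESHOLD = 0.5  # a self-assessment score below 50% triggers Phase 2


@dataclass
class Extraction:
    study_key: str          # key referring to the study in the reference manager
    self_assessment: float  # primary reviewer's self-assessment score, 0.0 to 1.0


def next_step(extraction: Extraction, secondary_agrees: Optional[bool] = None) -> str:
    """Return the status of an extracted study in the three-phase workflow."""
    if extraction.self_assessment >= REVIEW_THRESHOLD:
        return "no secondary review"        # assumption: Phase 1 result is kept
    if secondary_agrees is None:
        return "phase 2: secondary review"  # waiting for the secondary reviewer
    if secondary_agrees:
        return "agreed on"
    return "phase 3: discussion"


print(next_step(Extraction("S1", 0.8)))
print(next_step(Extraction("S2", 0.4)))
print(next_step(Extraction("S2", 0.4), secondary_agrees=False))
```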

V. RESULTS
In this section, the research questions are analysed and the findings obtained for each of them are reported. First, the analysis of RQ1 and a correlation analysis of RQ1 with RQ2 and RQ3 are presented in Section V-A. RQ3 and RQ2 are then answered in Sections V-B and V-C, respectively, and finally the analysis of RQ4 is presented in Section V-D.

A. MDE ACTIVITIES
In this section, in addition to answering ''RQ1: Which phase(s) of the system development is/are addressed in the study (using MDE)?'', a correlation analysis of RQ1 with RQ2 and RQ3 is carried out to identify the tools and languages used or developed in each of the MDE activities. Figure 5 shows the reported MDE activities and their frequencies. The studies differ in the number of MDE activities they addressed: 72 studies addressed 1 activity, 46 studies addressed 2 activities, 17 studies addressed 3 activities, and 5 studies addressed 4 activities. These MDE activities are discussed below, ordered from the most reported to the least reported.
development [50]. Table 5 summarizes the developed tools and languages and the corresponding papers.
It is worth noting that some of the studies present only a graphical or textual representation of the system, while another group of studies presents a fully functional DSML and its tool, which involve more than one MDE technique/phase, such as modeling, transformation, and code generation. We consider this latter process as system design. Therefore, in this SLR, we divide the primary studies into those defining only models and those presenting a complete design.
Studies that developed DSLs are as follows: [51] proposed a DSML called CyPhEF that supports the development and validation of self-adaptive CPS. [52] developed a simple graphical DSML for CPS, while a DSML for irrigation networks was developed in [53]. In [54], the authors developed a DSL that helps in the quick construction of co-simulations for CPS; the grammar of the DSL was implemented in Xtext, while the code generation was defined in Xtend. A framework called Advanced Vessel Simulation (AVS), which supports the design and evaluation of racing sailboat simulations, was developed in [55]. The AVS metamodel was developed in EMF, and Sirius was used for developing the graphical editor. A textual DSL named CHARIOT was created using Xtext in [56].
A DSL for managing different sensor configurations for a self-driving mini vehicle was developed in [57]. The domain knowledge, static semantics, and abstract syntax of this sensor management DSL were defined with the Eclipse Modeling Framework (EMF). [58] developed a DSML for the design of networked control systems (NCS) that uses passivity to separate the NCS control design from uncertainties (i.e. time delays and packet loss). In [59], the authors used an Ecore-based meta-model to define the abstract syntax of the proposed DSML, and the concrete syntax was implemented as an extension of Simulink standard blocks.
Reference [60] developed two meta-models for representing and sharing incident knowledge of CPS. Meta-models were developed as Eclipse plugins. A metamodel for a systematic analysis of CPS threat modeling was developed in [61] using MetaGME, while [62] developed a metamodel using ADOxx and UML and they used it for the description of an end-to-end communication use case. A meta-model for the development of a smart cyber-physical environment was presented in [63].
Reference [64] developed a meta-model for the flexibility and dynamic reconfiguration of automated production systems using the Eclipse Modeling Framework (EMF). [65] used a UML profile to develop a meta-model for modeling cyber-physical assembly systems. Also, in [66], UML class diagrams were used to develop a meta-model called Smart Environment Metamodel (SEM) to design smart cyber-physical environments. In [67], the authors extended some metamodels of SysML/MARTE to capture characteristics of CPS such as continuous behavior and stochastic behavior. The approach was implemented in GEMOC. They defined the abstract syntax using EMF, the graphical concrete syntax in Sirius, and the textual concrete syntax using Xtext. Meta-models conforming to the ISA95 and ISA88 standards were developed in [68] for monitoring the process of an oil production industry.
In [69], ADOxx was used to develop the modeling tool Cyber-Physical Systems for Industry (CPS4I), which connects CPS with conceptualizations of industrial applications in integrated models. A tool called FTOS, based on openArchitectureWare (oAW), was developed in [70]; it provides code generation for designing fault-tolerant automation systems. In [71], the authors presented BPMN4CPS, which is an extension of BPMN 2.0 for handling CPS features. New extensions for MechatronicUML were developed in [72].

2) SIMULATION
40 of the studies (16.74%) reported simulation. 11 studies addressed simulation only, and they can be summarized as follows: 1 study [87] developed a simulator, 2 studies developed meta-models [88], [89], and 8 studies [90]-[97] used existing tools for modeling and simulation. The remaining 29 studies incorporating simulation also addressed other activities (i.e. system design, transformation, V&V, etc.). Table 6 shows the tools and languages used for the simulation activities. Studies presented different reasons for using simulation. For example, [51] presented simulation as a feature of the developed DSL and used it for efficiency and time analysis via the MECSYCO co-simulation engine. Likewise, [53] presented simulation as a feature of the developed DSL and used it for performance analysis via MATLAB and EPANET. In [54], the authors developed a DSL for constructing HLA-based co-simulations. Reference [58] used Simulink for time and network delay analysis. Reference [74] used the Robocode simulator to simulate the behavior of a reconfigurable conveyor system, running it as a background simulation to output the time information and coordinates needed for generating a Java animation. Similar to [63], [98] used Simulink for time performance analysis. Reference [99] is another example of utilizing simulation for analysis purposes, in which the CPS Safety Analysis and simulation Platform (CP-SAP) was developed. Simulations were also used for security experimentation purposes, as in [91], [97], [100].

3) TRANSFORMATION
38 studies (15.90%) presented transformations (listed in Table 7). 30 studies covered one transformation type (either M2M or M2T), 3 studies considered two transformation types, and 1 study [114] presented 3 different transformation types, namely M2M, M2T, and T2M. The transformation types presented by the other 4 studies were not clarified. Therefore, 39 transformations were presented in total: 28 M2M transformations, 10 M2T transformations, and one T2M transformation.
Studies that implemented M2M transformations can be grouped into two categories. The first category covers the studies using existing tools and languages. [115] used the Y2U tool to transform Statechart models to UPPAAL timed automata models. [116] presented an M2M transformation from Simulink simulation models to AADL architectural models using the Assisted Transformation of Models engine. AADL and Modelica were used in [117], [118], where Modelica and AADL models were transformed to each other. In [119], the authors used the Critical Infrastructure Protection - Vulnerability Analysis and Modeling (CIP VAM) UML profile to transform UML models to Bayesian Network (BN) models. In [50], the authors transformed UML models to Distributed Embedded Real-time Compact Specification (DERCS) models using GenERTiCA. In [120], the authors transformed a SysML model to a graph by employing GraphML. Other studies include [98], [114], [121], which used QVT; [122], which implemented an M2M transformation using Xtext; and [74], [75], [123], which used Graph Rewriting and Transformation (GReAT) for M2M transformation, while EXTEND was used for the M2M transformation in [70].
Studies that developed a metamodel, tool, or language for the M2M, M2T, and T2M transformations constitute the second category. In [124], the authors proposed a transformation method that transforms a Simulink model to an ECML model by designing metamodels for both Simulink and ECML. [101] developed the model translation tool UPP2SF, which transforms UPPAAL timed automata models to Simulink/Stateflow. In the opposite direction, [5] developed a tool named STU that translates Simulink/Stateflow models into UPPAAL timed automata models. In [125], the authors developed a tool named ECPS Verifier that was used for the transformation of AADL models to UPPAAL timed automata. In [126], the authors presented a tool named Simulink/AADL Translator Tool (AS2T) that automates the transformation of Simulink simulation models to AADL models. Reference [127] presented a framework called Modana that helps in transforming SysML and MARTE models into Reactive Modules Language (RML) and Modelica models.
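The transformation counts reported at the beginning of this subsection can be reconciled with a short calculation. The sketch below only restates the figures given in the text; the variable names are illustrative.

```python
# Number of studies grouped by how many transformation types each one covers.
studies_by_type_count = {1: 30, 2: 3, 3: 1}
unspecified_studies = 4  # studies whose transformation type was not clarified

total_studies = sum(studies_by_type_count.values()) + unspecified_studies
total_transformations = sum(k * n for k, n in studies_by_type_count.items())

# Reported breakdown of the transformations by kind.
by_kind = {"M2M": 28, "M2T": 10, "T2M": 1}

print(total_studies)          # 38
print(total_transformations)  # 39
assert total_transformations == sum(by_kind.values())
```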
Studies using UPPAAL for verification include the following: [115] used UPPAAL to formally verify the safety properties of a medical guideline. A domain-specific model checking (DSMC) approach for MECHATRONICUML using the UPPAAL model checker was presented in [114]. A pacemaker was modeled and verified using UPPAAL in [101]. In [5], the authors used the UPPAAL tool for the verification of SIMULINK/STATEFLOW models after transforming them to UPPAAL timed automata. In [125], UPPAAL was used for the formal verification (i.e. model checking) of AADL models.
Other tools and languages used for model checking include the following: the Simple Promela Interpreter (SPIN) model checker was used in [75] to verify the Promela code. Also, in [138], SPIN was used as a model checker to verify the PrT net models after translating them into Promela code. In [139], the authors used a probabilistic model checker called PRISM. The authors in [140] verified their protocols via timed model checking MECHATRONICUML.
Simulink/Stateflow was used in [102] to verify supervisory controllers for hierarchical systems. Simulink Design Verifier (SLDV) was used for the verification of the simulation models in [141]. In [142], they used SLDV to verify the behavioral models developed in Simulink in order to meet the requirements modeled. Furthermore, Object Constraint Language (OCL) was used for the verification of the static semantics of a meta-model presented in [57]. Also, in [51], OCL was used for defining and validating metamodel constraints. In [129], a verification of KeYmaera-QHP code in KeYmaera, a hybrid verification tool, was presented. In [108], the authors used FORMULA for metamodel analysis and verification. Frama-C was used in [143] to prove and verify a developed C code library. In [142], Assume Guarantee REasoning Environment (AGREE) tool was utilized to verify that the AADL architectural models satisfy the system requirements.
Studies presenting validation are summarized as follows: [69] developed a modeling tool (CPS4I) and a modeling method (SeRoIn) and then validated them using Open Models Laboratory (OMiLAB). The CHECK validation language was used in [70] to formulate tests for the detection of semantic design errors in the developed models. The code generated in [144] was tested and analyzed using Frama-C. In [113], the authors implemented simulation-based validation.
A tool, named Simulation and Verification of Hierarchical Embedded Systems (SHARC), was developed in [110] for the verification of the behavior of automotive safety-critical systems. In [137], the authors developed an ontology and used it as the validation mechanism.

5) MODELING
33 studies (13.81%) reported modeling. This category encompasses studies that used existing languages/tools for modeling, whereas studies that developed a language/tool for modeling were included in the system design group. Only 4 studies [148]-[151] did not report any tool; instead, they either proposed an approach for modeling CPSs or used equational models. Table 9 shows the various languages/tools used for modeling by the studies.
In [107], [141], authors used Ptolemy II to model Medical CPS, while in [82] the behavioral model of a production nominal resource was modeled using Ptolemy II. In [106], authors modeled a Holter Monitor. They used Ptolemy II to model the device's functionality and UPPAAL for modeling system's state space and the transitions between them. A modeling approach called time-constrained aspect-oriented Petri net was presented in [152]. The approach combines discrete/continuous Petri nets and aspect-orientation for modeling CPS. In the work presented in [132], colored Petri nets were extended to probabilistic colored Petri nets for modeling and analyzing CPS attacks. Petri nets were also used in [153] for modeling smart grid threats.
A view-oriented approach was adopted in [117] for the description of different aspects of an aerospace CPS. Modelica was used for modeling the overall architecture of a lunar rover robot and its body structure, while AADL was used for modeling the navigation system of the lunar rover. Similarly, the authors in [154] integrated AADL, UML, and Modelica to model the requirements of a vehicular ad-hoc network. Further, in [105], AADL and Modelica were integrated to model big data-driven CPS. In [142], Simulink/Stateflow was adopted to model a generic patient-controlled analgesia infusion pump system for analyzing logical requirements and behaviors, while AADL was used for developing the architectural model of the system. YAKINDU Statechart Tools was adopted in [115] to model and simulate a stroke statechart model. Likewise, in [145], YAKINDU statecharts were used for the modeling of a simplified cardiac arrest. The study in [155] also created a UML statechart model for an envisioned CPS scenario using the YAKINDU statechart modeling tool. In [121], the Papyrus tool was used for creating the UML models. The authors in [156] presented a methodology for knowledge representation of CPS using the modeling tool Papyrus. Reference [13] used UML for defining the dependability analysis models. Implementations of CPS modeling using HybridUML were presented in [129].
A finite state machine (FSM) was adopted in [157] to model the behavior of automation components. [112] used an FSM to describe the logic of a servo tester. GME was used in [158] to build Lathe CNC System models and export the models' data as an XML file. ASLan++ was used in [159] for modeling a water treatment plant and an attack model. Reference [150] presented a new formalism named Stochastic Occurrence Hybrid Automata and a modeling approach for the stochastic behavior of CPSs.

6) CODE GENERATION
24 studies (10.04%) reported languages/tools used or developed for code generation purposes. Table 10 lists the languages/tools used and developed for code generation. These studies are categorized here into studies that used existing tools for code generation and studies that developed new code generation tools.
In [101], Simulink Real-Time Workshop Embedded Coder (RTWEC) was used to generate C code from a pacemaker Stateflow chart. Likewise, C code and VHDL code were generated in [5], [142] from the Stateflow models using Simulink Coder. Moreover, a tool named GeneAuto, which generates C or Ada code from Simulink models, was presented in [143]. In [108], the built-in code generator for the Embedded Systems Modeling Language (ESMoL) was used to generate functional C code. The IOPT-Flow tool framework was used in [109] to generate C and JavaScript code or VHDL hardware descriptions.
Code generation for Programmable Logic Controllers (PLCs) was presented in [162] using the Scenario Modeling Language (SML). In the same manner, an implementation of PLC code generation was presented in [113] using the Compositional Interchange Format (CIF). Clock Constraint Specification Language (CCSL) constraints were utilized in [147] for code generation purposes. The authors in [54] used Xtend to support code generation for OpenRTI. The Kevoree Modeling Framework (KMF) was used in [81] for the generation of a Java API in order to create and manipulate the runtime models.
In [163], the authors developed a tool named I2C4IOPT for the automatic code generation of globally asynchronous and locally synchronous (GALS) systems supported by Arduino boards. An ISA88 editor was implemented in EMF in [64] to generate programmable logic controller (PLC) control code. A code generator was developed in [58] for the generation of Simulink models and network scripts. In [144], a model-based code generator for medical CPS was presented. An interpreter was developed in [75] to translate finite-state machine (FSM) models and constraints into Promela code.
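To make the idea of model-based code generation more concrete, the following is a minimal sketch of a model-to-text step that turns a small FSM model into C source text. It is not taken from any of the primary studies or tools listed above; the FSM, the function name `generate_c`, and the naming scheme of the generated identifiers are all assumptions made for illustration.

```python
# Hypothetical FSM model: state -> {event: next_state}.
fsm = {
    "Idle": {"start": "Running"},
    "Running": {"stop": "Idle", "fault": "Error"},
    "Error": {"reset": "Idle"},
}


def generate_c(model, name="controller"):
    """Emit C source text implementing the transition function of the FSM."""
    states = list(model)
    events = sorted({e for transitions in model.values() for e in transitions})
    lines = [
        "typedef enum { " + ", ".join("S_" + s.upper() for s in states) + " } state_t;",
        "typedef enum { " + ", ".join("E_" + e.upper() for e in events) + " } event_t;",
        "",
        f"state_t {name}_step(state_t s, event_t e) {{",
        "    switch (s) {",
    ]
    for state, transitions in model.items():
        lines.append(f"    case S_{state.upper()}:")
        for event, target in transitions.items():
            lines.append(f"        if (e == E_{event.upper()}) return S_{target.upper()};")
        lines.append("        break;")
    # Events with no matching transition leave the state unchanged.
    lines += ["    }", "    return s;", "}"]
    return "\n".join(lines)


code = generate_c(fsm)
print(code)
```

Real generators in this category typically rely on template languages and metamodels rather than string concatenation; the sketch only shows the principle that the generated text is derived entirely from the model.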

7) SYSTEM ANALYSIS
15 studies (6.28%) reported languages/tools that were used or developed for system analysis. Table 11 lists the languages/tools used and developed for system analysis. The studies can be categorized into those directly using existing tools for system analysis and those developing new tools for system analysis purposes.
In [165], meta-models for operational analysis and system analysis were developed. The authors also used TTool for safety analysis. A knowledge-based approach using Failure Modes, Effects and Criticality Analysis (FMECA) techniques was presented in [166]. The authors first modeled FMECA using a UML class diagram; the FMECA metamodel was then expressed in Protégé and used to build an ontology-based knowledge base (KB). A metamodel was developed in [167] to enable the management of application requirements and business constraints for CPS. A CPS meta-model for knowledge formalization was presented in [168], where the authors also implemented formal concept analysis. The CPS Safety Analysis and Simulation Platform (CP-SAP) was developed in [99] for Human-machine interaction (HMI) safety analysis of CPS. A framework called Modana, which aims to model and analyze the non-functional aspects (e.g. time and energy) of Energy-Aware CPS, was presented in [127].
In [152], a modeling approach based on discrete/continuous Petri nets was proposed for schedulability analysis. Also, a modeling approach that supports both qualitative and quantitative analysis of CPS attacks using probabilistic colored Petri nets was presented in [132]. CPS dependability analysis was presented in [169] using Stochastic Petri Nets (SPN). An approach for the specification and analysis of automotive CPS was presented in [154], where Modelica was used for analyzing the engine model and AADL was used for end-to-end delay analysis from the brake pedal to the throttle actuator. The security analysis tool CL-AtSe was used in [159] for analyzing and discovering potential attacks on Industrial Control Systems.
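The percentages reported for the activities in this section can be checked against a common denominator. In the sketch below, the value 239 is inferred from the reported figures and is not stated in the text; it should therefore be read as an assumption about the total number of activity reports.

```python
# (number of studies, reported percentage) for the activities quoted above.
reported = {
    "Simulation": (40, 16.74),
    "Transformation": (38, 15.90),
    "Modeling": (33, 13.81),
    "Code generation": (24, 10.04),
    "System analysis": (15, 6.28),
}

DENOMINATOR = 239  # assumption: inferred from the reported percentages

for activity, (count, pct) in reported.items():
    computed = 100 * count / DENOMINATOR
    # Each reported value matches the computed share to two decimal places.
    assert abs(computed - pct) < 0.005, activity
    print(f"{activity}: {computed:.2f}%")
```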

B. DEVELOPED TOOLS/LANGUAGES FOR APPLYING MDE ON CPS
In this section, the results and findings of ''RQ3: Is/Are there any tool and/or DSL developed for MDE of CPS by the study?'' and its sub-questions are presented.

RQ3.1: If any tool and/or language is developed in the study, is it reported?
59 studies developed DSLs/DSMLs, metamodels, tools, or their extensions. 22 of these studies developed metamodels, 15 studies developed DSLs/DSMLs, 18 studies developed tools (including 2 frameworks and 1 platform), and 4 studies presented extensions for other tools/languages, as shown in Figure 6. These developed tools/languages were addressed in detail in the correlation analysis of RQ1 with RQ2 and RQ3 presented in section V-A.
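The breakdown of the developed artifacts can be tallied as follows. This is a minimal sketch restating the reported counts; the share of DSL/DSML studies is computed against the 140 primary studies, and the dictionary keys are illustrative labels.

```python
# Number of studies per kind of developed artifact, as reported above.
developed = {"metamodels": 22, "DSLs/DSMLs": 15, "tools": 18, "extensions": 4}
PRIMARY_STUDIES = 140

total_developing = sum(developed.values())
dsl_share = 100 * developed["DSLs/DSMLs"] / PRIMARY_STUDIES

print(total_developing)     # 59
print(f"{dsl_share:.2f}%")  # 10.71%
```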

RQ3.3: What is/are the framework(s) or programming language(s) for its/their development?
As can be seen in Figure 7, UML is the most used language, followed by EMF and GME. Figure 7 shows the correlation between RQ3.1 and RQ3.3. UML is mostly used for building metamodels, whereas EMF is used for building both metamodels and DSLs, and GME is mostly used for building DSLs. Other reported tools are GReAT, which is used alongside GME [74], [75]; Sirius, which is used for building the graphical concrete syntax [55], [67], [78]; and Xtext, which is used for developing the DSL grammar in [54] and for building the textual concrete syntax in [67]. These tools/languages are listed in Table 5 and discussed in detail in section V-A. GME appears to be the third most used framework for building DSLs and metamodels. However, the distribution of the RQ3.3 results over the publication years (see Figure 8) shows that GME was not used by any of the primary studies in the last 2 years (2017 and 2018) of the examined period. Further, the results of RQ3.3 were distributed over the authors' countries of affiliation, as depicted in Figure 9. GME and its tool GReAT were used only by researchers affiliated with the USA, whereas UML and EMF were mostly used by researchers affiliated with Europe.

C. USED TOOLS/LANGUAGES FOR APPLYING MDE ON CPS
In this section, the results and findings of ''RQ2: Is/Are there any tool(s) and/or Language(s) used to apply MDE in/for CPS in the study?'' and its sub-question are presented.
RQ2.1: Which tool(s) and/or language(s) is/are presented/used in each phase of the system development? Figure 10 shows the tools and languages used by the primary studies. A correlation analysis of RQ1 with RQ2 and RQ3 has already been presented in section V-A to give the reader a clear idea of the MDE phase/activity in which each tool/language was used. Therefore, in this section, only the most used tools/languages are briefly discussed.
Our study found that Simulink is the most used tool. The majority of the reviewed studies used Simulink for simulation purposes (see Table 6). Simulink was also used for modeling [142], [146]. Simulink Coder was used for code generation purposes in [5], [101], [142], while Simulink Design Verifier (SLDV) was adopted for the verification of models [141], [142], [146]. AADL follows Simulink as the second most used tool/language. It was used for modeling the cyber part of the systems [117], developing architectural models [142], or for system analysis [154]. UML was used by various studies for building metamodels (see Table 5). It was also used for modeling activities such as defining dependability analysis models [13]. UPPAAL was used mostly for verification (see Table 8); it was also used for modeling, for instance in [106], where it was used to model the system's state space and the transitions between states.

D. ADDRESSED CPS COMPONENTS
In this section, the results and findings for ''RQ4: What is/are the CPS component(s) addressed in the study?'', and a correlation analysis of RQ1 with RQ4 are presented.
According to [172], a CPS mainly consists of 5 components: Physical components, Cyber components, Sensors, Actuators, and Network. Amongst the 140 primary studies covered in this SLR, only 6 papers were left undetermined (the CPS component addressed by these papers could not be determined) and 9 studies addressed more than 1 CPS component. Figure 11 shows the categories of CPS components. The full list of studies and the CPS components they address is given in Table 12.
• Both Cyber & Physical components: Reported by 26 studies (17.8%). This category contains the studies discussing modeling both cyber and physical aspects of the system. [176] reported about modeling a controller (cyber component), and a plant (physical component). Another example is [117] where the authors modeled a lunar rover robot's body (physical component) and its navigation system (cyber component).
• Actuators: Reported by only 5 studies (3.4%). This is the least addressed component compared to the other CPS components. For instance, [116] covered actuator modeling and design, while [13] discussed actuator failure.
• Other: Studies that do not fit any of the above categories are grouped under this category. They consider business processes [86], workflows (process) [79], and data [178].
Further, in this SLR, a correlation analysis of the MDE activities and CPS components is performed to provide an understanding of the MDE activities addressed in the development of each CPS component (see Table 13). Although the correlation analysis cannot characterize the CPS domain as a whole, one can see that, considering the Cyber component for instance, most research works concentrated on Transformation (22 studies), V&V (18 studies), Simulation (17 studies), and Code generation and System design (16 studies each), while Requirement analysis and System analysis were addressed by only 4 and 3 studies, respectively.
Similarly, for the Physical component, the research work converged on Simulation (10 studies), System design and V&V (8 studies each), and Modeling (7 studies) while System analysis and Requirement analysis are again less addressed (3 studies and 1 study respectively). In terms of the sensor component, most research works concentrated on System design (6 studies), however, System analysis for sensors was not addressed by any study. Regarding the actuator component, which is the least addressed CPS component, it is interesting to note that Code generation, Simulation, Requirement analysis, and System analysis were not addressed by any study.
Finally, taking the CPS components into consideration, one can also deduce from the above results that actuators are currently the least studied components in MDE of CPS. Furthermore, Requirement analysis and System analysis are the least addressed activities across all CPS components.
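A correlation analysis of this kind amounts to cross-tabulating, for every study, the MDE activities it addresses against the CPS components it targets. The sketch below illustrates the counting on a few hypothetical extraction records; the study keys and their contents are invented for illustration and do not correspond to the data in Table 13.

```python
from collections import Counter

# Hypothetical extraction records: (study key, MDE activities, CPS components).
records = [
    ("S1", ["Transformation", "V&V"], ["Cyber"]),
    ("S2", ["Simulation"], ["Physical"]),
    ("S3", ["System design", "Code generation"], ["Cyber", "Sensors"]),
    ("S4", ["Simulation", "V&V"], ["Cyber"]),
]


def cross_tabulate(records):
    """Count studies for every (CPS component, MDE activity) pair."""
    table = Counter()
    for _key, activities, components in records:
        for component in components:
            for activity in activities:
                table[(component, activity)] += 1
    return table


table = cross_tabulate(records)
print(table[("Cyber", "V&V")])             # 2
print(table[("Actuators", "Simulation")])  # 0 (pair not addressed by any record)
```

A study that addresses several activities or components contributes to several cells, which is why the cell totals of such a table can exceed the number of studies.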

VI. DISCUSSION AND THREATS TO THE VALIDITY
In addition to the detailed assessments given in the previous section for the results of the conducted SLR, this section includes a more general discussion of the findings along with their implications. Threats to the validity of the study are also discussed in this section.
Regarding the MDE activities/techniques addressed (RQ1), the most considered MDE activity was system design. Researchers developed DSLs (15 studies), metamodels (22 studies), and tools (18 studies) for this purpose. The results show that the number of studies that developed a DSML is quite low (10.71%, only 15 studies out of 140) for a wide and complex domain like CPS. Although general-purpose modeling languages such as UML and SysML have rich tool support, they tend to be too general to capture domain-specific aspects, and they also lack the detailed formal semantics needed for formal analysis. In contrast, DSLs are specialized modeling languages that are developed according to the needs of a specific application domain. Consequently, they constitute a good foundation for domain-specific formal analysis and automated tool support. Further, DSLs tend to use notations that domain experts are familiar with, which in turn makes them widely accepted [52], [186]. Therefore, one can deduce that there is an opportunity for more research on designing DSLs that address different aspects of the CPS development life-cycle and cover as many application domains as possible. DSLs can provide a higher level of abstraction for complex systems such as CPS, which may increase performance and decrease the time and cost of CPS development.
Simulation was the second most reported MDE activity (40 studies, 16.74%). Apart from 1 study that developed a tool [87] and 2 studies that developed metamodels [88], [89] for simulation purposes, the rest of the studies (37 studies) used existing simulation tools and languages.
Furthermore, the analysis of the results for RQ 1 showed that M2M transformation has received more attention, in terms of existing/developed tools and languages, than the other transformation types, M2T and T2M. In addition, languages like GenERTiCA and Xtext were used to implement more than one transformation type. It is also worth mentioning that tools like UPP2SF and STU can be used as complementary tools for M2M transformation. Other complementary languages for modeling CPS are Modelica and AADL, where Modelica is used for modeling the physical world and AADL for modeling the cyber components; the transformation between these two languages does not require any third-party tool or language [117], [118]. V&V was reported by 35 studies (14.71%). However, apart from 2 studies [110], [136] that developed a new tool and one study that developed an ontology [137] specific to CPS, the remaining studies preferred existing general-purpose V&V tools. [...]tion of these tools and provide the adoption of them in various industry fields. The results of RQ3.3 revealed that UML, EMF, and GME were the most used tools and environments: together they were used in 59 different studies for developing various DSLs, metamodels, and tools. This is somewhat expected, given the widespread use of UML for system modeling in different application areas and the fact that many popular modeling language creation workbenches are EMF-based. However, GME did not appear in any study in the last 2 years of the examined period (2017 and 2018). The findings also showed that GME was mostly used by researchers affiliated with institutions in the USA, while UML and EMF were mostly used by researchers affiliated with institutions in Europe.
Regarding the CPS components addressed during MDE of CPS (RQ4), one interesting result is that most of the primary studies consider only the cyber and physical components of CPS. Only a limited number of studies also focus on components such as sensors, actuators and networks. There may be several reasons for this. For instance, some of the studies (e.g. [64], [68], [102], [132], [175]) represent the sensors, actuators, network elements, etc. under the umbrella of the physical plant; hence, they do not include models specialized for these components other than the physical model. Another reason may be that modeling a system in full detail, including its sensors and actuators, can lead to very large and complex models, especially for large CPS, which can make the application of MDE less efficient and almost infeasible.
This study is meant to provide practitioners with insight into which languages/tools are most commonly used when applying MDE techniques to CPS development, and to help them easily identify how these languages/tools have been used in the various MDE techniques/activities. Moreover, based on the findings given above, the conducted SLR may guide researchers in shaping their future work on the MDE of CPS by taking into account the following open issues:
• The distribution of MDE techniques over the targeted CPS components shown in Table 13 sheds light on the areas where research efforts are concentrated and the areas where potential research can be undertaken. For instance:
- Modeling the sensor and actuator components of CPS and generating executable artifacts from the corresponding models can be investigated, since current studies on them are rare.
- Similarly, MDE research on requirement and system analysis activities, as well as on simulation, for sensor and actuator components needs to be increased, which may improve the coverage and functionality of the design models for CPS.
• DSLs and domain-specific simulation and verification tools specific to CPS are currently rare. Such languages and tools can help developers deal with the complexity of CPS by working at a higher level of abstraction.
• Although this SLR both determined the languages/tools and revealed how they are utilized during MDE of CPS, additional work on comparing these languages/tools, possibly in a qualitative or quantitative manner, may help in formalizing a detailed guide for language/tool selection in the field of MDE for CPS.
• The reasons for the decline of the Generic Modeling Environment (GME) may be further investigated, since our results showed that although GME was used consistently from 2011 to 2016 in the primary studies, it was not used in the last 2 years (2017 and 2018) of the examined period.

THREATS TO THE VALIDITY
Threats to the validity of this SLR study are classified according to the categories proposed in [187]; hence, they comprise four types, namely construct, internal, external and conclusion validity threats.
Construct Validity: It represents how well the SLR study reflects the intent of the researchers and what is asked by the research questions. To define the research questions, the process proposed by [38] and [39] and the guidelines defined by [10] were followed in this study.
Another aspect of construct validity is to ensure that all relevant studies on the selected topic are found. The possibility of missing primary studies is a common threat to the validity of any SLR. For instance, although CPS is a well-known and widely used concept, keywords such as ''embedded systems'' and ''real-time systems'' might have been used previously to refer to systems that are also CPS. However, we believe that omitting such keywords poses a negligible risk in this SLR, especially considering the publication years covered. Thus, the terms listed in Table 3 are sufficient to be used as keywords to find the most appropriate studies, as planned at the beginning of the study. Moreover, to mitigate this risk, the search strings were formed through several iterations, which also enabled adequate coverage of the literature. Another option for selecting primary studies is to start the search directly from major venues, including journals and conferences that are well known in CPS research. Such a search strategy may be effective in quickly identifying the significant studies in the related field, which may pave the way for, e.g., an in-depth survey of studies in a more specific field. For instance, that selection approach can be used to provide a survey and evaluation of the model transformations performed during MDE of CPS. However, it yields a relatively small set of studies, and many studies published in other venues may be neglected, which is a concern for SLRs as opposed to surveys. Hence, as suggested in the SLR guidelines given in [10] and commonly applied in current SLRs covering other research domains, we also preferred searching general digital libraries/databases to determine the primary studies using the search strings, which makes the SLR both transparent and replicable.
It is also worth noting that these digital libraries already index most reputable publication venues, and the list of publication venues included in our online repository for this SLR indicates that the coverage of the search is sufficient in this context. Furthermore, to improve the results, the forward snowballing sampling method was used, and it proved to be effective.
Internal Validity: This relates to the degree to which the design and the conduct of the SLR study are likely to prevent systematic errors. Internal validity is a prerequisite for external validity [10]. Therefore, both qualitative and quantitative analyses were used to minimize threats. The use of a rigorous protocol and data extraction form mitigates this kind of threat to validity. Moreover, threats originating from personal bias or lack of understanding of a study were reduced by conducting the data extraction phase iteratively. For this purpose, one researcher extracted data from the primary studies and answered the quality and self-assessment questions. The other two researchers (experts in CPS and MDE) reviewed the data extracted from studies with self-assessment rates under 50%.
External Validity: According to [187], external threats concern the generalizability of the SLR results, that is, the degree to which the primary studies are representative of the reviewed topic. In this study, the set of primary studies may not be representative of the entire set of existing studies on the topic, MDE for CPS. However, this threat was mitigated as follows. First, the search strategy consisted of manual and automatic search, followed by forward snowballing. The forward snowballing enabled finding studies that were not captured by the search strings in the digital libraries. Second, the inclusion and exclusion criteria of the protocol created in this study support refining the set of primary studies so that only studies relevant to the topic are included. Only studies in English were included. Papers written in other languages concerning the same topic may exist. However, this threat is considered to have minimal effect.
Conclusion Validity: Not all relevant primary studies may have been identified [10]. To alleviate this threat, the research protocol of this study was designed and validated carefully to minimize the risk of excluding relevant studies. Search strings were formed in such a way that only a very small number of relevant studies could be missed, and only a manageable quantity of irrelevant studies could be included. Besides the automatic search, a manual search and forward snowballing were performed. We did not apply backward snowballing in addition to forward snowballing, since some references obtained by backward snowballing would fall outside our search range, i.e., it would force us to access and examine papers published before 2010. Eliminating these older papers would incur additional cost with probably very limited benefit, and we already had a large pool of papers. The protocol was rigorously defined to be reusable by other researchers for reproducing the same study, i.e. the protocol is available on IEEE Data Port. 4

VII. CONCLUSION
CPS have proven to offer tremendous opportunities in almost all areas of industry and society. Due to their inherent heterogeneity and complexity, engineering and managing such systems is known to be a challenge for developers. Thus, numerous studies have been conducted, and are still being conducted, in this domain.
The aim of this study was to identify the current features of the use of MDE for CPS. For this purpose, an SLR of the papers in the field, published between 2010 and 2018, was performed. The initial search retrieved 646 papers of which 140 were included in this study by following the defined selection strategy through a multi-stage process. A key feature of this SLR is that it is not restricted to a particular CPS domain. This broad scope in the search gives deeper insights into the state-of-the-art of using MDE for CPS. Findings contribute new knowledge that can be used to improve CPS development using MDE.
The study points out that MDE for CPS is an active research area with an increasing number of publications over the years. Moreover, the study shows the covered areas in addition to the languages and tools that have been used/proposed. Regarding the CPS components, our study also showed that the MDE effort was mostly put into the development of cyber and physical components, whereas the other components (networks, sensors, and actuators) did not get much attention. The results revealed that solutions based on UML and Eclipse-based tools were mostly preferred.
Finally, the study also presented some open issues in MDE of CPS to assist researchers and practitioners in shaping their future work in this area. For instance, designing and implementing the sensors and actuators used in CPS with MDE, and developing domain-specific simulation tools, require further investigation.
MUSTAFA ABSHIR MOHAMED received the B.Sc. degree in information communication technology (ICT) from Gollis University, in 2015. Then, he was awarded a scholarship from the Turkish Government under the program (YTB) for his master's degree. In 2020, he completed his graduate studies at the International Computer Institute, Ege University, and received the M.Sc. degree in information technologies.