Attack Detection for Medical Cyber-Physical Systems–A Systematic Literature Review

The threat situation due to cyber attacks in hospitals is emerging and patient life is at risk. One significant source of potential vulnerabilities is medical cyber-physical systems (MCPS). Detecting intrusions in this environment faces challenges different from other domains, mainly due to the heterogeneity of devices, the diversity of connectivity types, and the variety of terminology. To summarize existing results, we conducted a structured literature review (SLR) following the guidelines of Kitchenham et al. for SLRs in software engineering. We developed six research questions regarding detection approach, detection location, included features, adversarial focus, utilized datasets, and intrusion prevention. We identified that most researchers focused on an anomaly-based detection approach at the network layer. The primary focus was on the detection of malicious insiders. While several researchers used publicly available datasets for training and testing their algorithms, the lack of suitable datasets resulted in the development of testbeds consisting of various medical devices. Based on the results, we formulated five future research topics. First, the special conditions of hospital networks, the MCPS deployed within them, and the contrasts to other IT and OT environments should be examined. Thereupon, MCPS-specific datasets should be created that allow researchers to address the health domain’s unique requirements and possibilities. At the same time, endeavors aimed at standardization in this area should be supported and expanded. Moreover, the use of medical context for attack detection should be further explored. Last but not least, efforts for MCPS-tailored intrusion prevention should be intensified. This way, the emerging threat landscape can be addressed, IT security in hospitals can be improved, and patient health can be protected.


I. INTRODUCTION
The healthcare sector faces an increasing threat of cyber attacks. A Comparitech study explored the threat landscape of the US sector and found that between 2016 and 2022, 6,835 healthcare companies were hit by ransomware [1]. As early as 2014, a SANS report warned of the risks of MCPS and identified "Nontraditional medical endpoints" as one of the main sources of malicious traffic [2]. Gartner predicts for the year 2025 that operational technology (OT) environments will be weaponized to harm or kill people and that the resulting financial impact from such attacks will amount to $50 billion per year [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Peng-Yong Kong.
Coventry et al. surveyed hospital staff to determine the reasons for clinics' high IT security risks. One key finding is that medical device software is often outdated and unsupported [4]. This is consistent with a report by the European Union Agency for Network and Information Security (ENISA), which stresses that legacy software and unpatched vulnerabilities are particularly critical in the healthcare sector. Accordingly, imaging systems, patient monitoring, and medical device gateways account for 86% of hospital security issues [5]. As Coventry et al. emphasized, securing legacy medical devices seems more crucial than ever. Despite these findings, current security scanners often fail to detect vulnerabilities in the healthcare environment because they do not have modules for medical devices and systems [6].
Intrusion detection is a technology that has been around for more than three decades [7]. Many organizations rely on it even more in a time of increasing cyber threats. Its most widespread application is in the field of (office) information technology (IT). In contrast, OT is less well covered in many sectors. Only in recent years have there been efforts to transfer insights from IT to OT, primarily because of the emerging threat situation [8]. Many sectors can interoperate and share their sector-specific discoveries and perceptions. The health sector differs in this regard for three reasons: the heterogeneity of OT devices in healthcare, the diversity of connectivity types, and the variety of terminology. The heterogeneity of devices and missing regulations lead to a situation where no central management of devices from assorted manufacturers is possible. Furthermore, the different needs of different devices lead to different requirements for connectivity. For example, while computed tomography scanners are regularly connected via a wired connection, wearable medical devices require a wireless connection for obvious reasons. Researchers cover this domain as wireless sensor networks (WSN), of which subgroups are medical smartphone networks (MSN) and wireless body area networks (WBAN). These networks consist of devices known as wearables, which are worn by a person and can also be connected to each other. Since devices of those groups are carried around, they have special requirements not only for connectivity but also for an intrusion detection system (IDS). In addition, the need for real-time detection for all MCPS, regardless of the device type, is argued to be even more critical than for existing mechanisms because lives could depend on timely detection [9]. While some researchers try to detect intrusions in a protocol-agnostic way, others are motivated by the particular conditions of a subset of medical devices. This leads to different categorizations and definitions of groups and, thereby, to various terms. We further discuss the diversity of terms in section III-B.
This paper aims to outline the existing research on attack detection in the healthcare sector. We focus on the current state of research in detecting attacks on medical devices available for hospitals and clinics. The challenges of hospital networks, attached medical devices, and the plethora of protocols used by those devices are of particular interest. By discussing this environment's background and special requirements, we identify research gaps and give future endeavors direction.
The paper is organized as follows. In section II, we present other secondary studies and point out how this paper complements the existing work. Thereafter, in section III, we describe our methodology and present the research questions, according to which we have evaluated the studies. The results are presented in section IV. In section V, we discuss the findings and work out the implications. Finally, we draw a conclusion in section VI.

II. RELATED WORK
Existing reviews and survey papers on MCPS attack detection can be summarized into four groups. The papers of the first group discuss work about general IoT and merely touch the area of medical devices. They refer to MCPS either as motivation or to highlight them as a unique area with particular characteristics. One example is Banerjee et al., who discuss the security of several sectors in which IoT is used and how blockchain could improve it in the future. The healthcare domain is a characteristic example in which very sensitive data must be shared, and privacy is essential [10].
The second group concentrates on medical device security and attempts a comprehensive overview. Either they use broad definitions of security and include not only IT security but also privacy and patient safety, or the review outlines several IT security measures. Examples are Yaacoub et al., Tervoort et al., and Ferrag et al., who provide a detailed overview of relevant attack scenarios for medical devices and discuss which defensive measures can protect the devices from which attacks. These measures range from technical to non-technical aspects. Yaacoub et al. recommend a layered security architecture, ranging from raising awareness through employee training to sophisticated intrusion detection, mainly through a machine learning (ML)-based intrusion detection and prevention system cooperating with honeypots and security information and event management (SIEM) to gain the latest insights into attacks [11]. Tervoort et al. conduct a scoping review presenting an overview of security solutions for medical software vulnerabilities that do not require the software to be replaced. Besides intrusion detection, monitoring specific aspects of medical devices, such as software execution characteristics and tunneling legacy protocols, have been examined [12]. Ferrag et al. outline security solutions for the Internet of medical things (IoMT) of five categories: authentication and access control, key management and cryptography, intrusion detection systems, blockchain-based solutions, and privacy-preserving solutions [13].
The third group of papers focuses on a specific aspect or a specific type of approach. Thomasian and Adashi summarize the policy and regulatory measures (primarily concentrated on the US) to secure medical devices. Furthermore, they provide an overview of the emerging threats in this context on a high level [14]. Hameed et al. elucidate ML-based approaches in their structured review of security and privacy in the context of the IoMT. Besides insightful statistical data surrounding the publications, such as the geographical distribution of research groups and the development of publications per year, a focal point of the work of Hameed et al. is ML-based intrusion detection [15]. Rbah et al. concentrate their efforts on comparing deep learning methods utilized for IDSs in the IoMT. They observe that many researchers develop their approaches in an isolated environment for a limited number of attacks [16]. Pelekoudas-Oikonomou et al. review blockchain-based security mechanisms for IoMT edge networks. While they describe several ways in which attack detection in IoMT-edge networks could benefit from a blockchain extension, they state that, to their knowledge, there are no blockchain-based IDSs specifically designed for IoMT-edge networks yet. Instead, they outline approaches from other IoT environments and show how they could be applied to IoMT-edge networks [17].
VOLUME 11, 2023
The fourth group of papers compares approaches tailored to small subsets of MCPS. Eliash et al. discuss the security of the subset of medical devices used in intensive care units (ICUMDs), introduce a taxonomy for these devices, and explain how these devices interact with each other. They develop scenarios for 16 attacks on medical devices and derive the main building blocks. Additionally, they analyze the applicability of existing security mechanisms, including detection mechanisms [18]. Similarly, Kintzlinger and Nissim establish a taxonomy for personal medical devices (PMDs) and collect attack scenarios and building blocks for attacks on this group of devices. Furthermore, they review the existing security solutions and identify the gaps between them and the identified attack vectors [19]. Ghosal [21], and Wazid et al. compare detection approaches for malware in the IoMT environment [22].
This paper differs from those presented as it aims to provide an overview of all intrusion detection approaches for all kinds of medical devices available for hospitals and clinics. The main contributions can be summarized as follows:
• We present the current state of research on attack detection in medical cyber-physical system environments. In particular, we show the various challenges that are special or unique to the health sector and frame our research questions around these specifics.
• As a distinct difference from other secondary studies based on a single or small number of keywords (e.g., IoMT), we identified 22 synonyms for MCPS. We included them in an extensive database search as a basis. The high number of synonyms allows a comprehensive and profound analysis of the research state.
• By following the guidelines of Kitchenham et al. for structured literature reviews in software engineering, we minimized the risk of a biased consideration of the studies available. This includes 1) a structured two-step screening process, 2) transparent inclusion and exclusion criteria for study selection, and 3) the independent review of studies by at least two researchers in every selection and extraction step.
• By answering six research questions, we structure the confusing and convoluted state of the literature and highlight commonalities and differences. For exceptional approaches, we present a detailed description.
• We critically engage with the selected aspects of the research and discuss the applicability of the proposed approaches.
• From the resulting discrepancies, we derive five future research topics that will help researchers conduct more focused research.

III. METHODOLOGY
We adopted the guidelines for performing systematic literature reviews (SLR) in software engineering [23]. According to Kitchenham et al., the goal of such a review is threefold: Firstly, the review shall summarise the existing results in a field. Secondly, it should identify gaps in the current research, and thirdly, it should provide the background to position future research endeavors.

A. RESEARCH QUESTIONS
We developed six research questions to determine the state of research in the field of MCPS attack detection. These are outlined in the following.

1) WHICH DETECTION APPROACH IS USED?
First, we wanted to ascertain what detection approaches are utilized most to detect attacks in hospital environments.
The literature distinguishes three types of IDSs:
• signature-based detection
• anomaly-based detection
• specification-based detection
The two best-known subcategories are signature-based and anomaly-based IDSs. Signature-based IDSs use predefined patterns of known attacks to detect intrusions in a pattern-matching approach. The major downside is that those systems can only detect known attacks. Even the smallest changes that modify the signature of the attack might evade detection. The upside is a low number of false alarms.
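The pattern-matching idea, and its weakness against slightly modified attacks, can be illustrated with a minimal sketch. The signature names and byte patterns below are hypothetical and chosen only for illustration; real IDS rule languages are far richer.

```python
# Hypothetical signatures (name -> byte pattern); illustrative only, not real IDS rules.
SIGNATURES = {
    "example-shell-download": b"wget http://",
    "example-sql-injection": b"' OR 1=1",
}


def match_signatures(payload: bytes) -> list[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


known_attack = b"GET /?id=' OR 1=1 --"
modified_attack = b"GET /?id=' OR 2=2 --"  # same intent, slightly different bytes

print(match_signatures(known_attack))     # the known pattern is found
print(match_signatures(modified_attack))  # the small change evades every signature
```

The second call returns no match although the request is just as malicious, which is the evasion problem described above.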
The counterpart is anomaly-based intrusion detection, which has drawn much interest in the research community. Those IDSs model the expected behavior of a system or network and warn in the case of deviation from baseline behavior. Its advantages and disadvantages mirror those of signature-based detection: While this approach might detect even zero-day attacks, it is difficult to consider every borderline case in the baseline, which ultimately leads to a higher count of false positives. Other frequently cited challenges in the context of MCPS are limited sources of energy and constrained computational power. Often, especially in the case of wearable devices, those resources are already utilized by the device's primary purpose, so few resources remain for the intrusion detection algorithm. Moreover, even if one may argue that some wearables have an easily changeable battery, the need for energy-saving algorithms and protocols is most evident for implantable medical devices (IMD). Consequently, motivated by these considerations, several research endeavors focus on energy and resource-efficient intrusion detection approaches (e.g., [24], [25], [26], [27]).
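As a minimal sketch of the baseline idea, the following example models normal behavior by the mean and standard deviation of a single feature and flags large deviations. The feature (packets per second), the sample values, and the threshold of three standard deviations are assumptions made for illustration; the reviewed approaches typically use ML models over many features.

```python
from statistics import mean, stdev


def fit_baseline(values):
    """Model expected behavior as the mean and sample standard deviation."""
    return mean(values), stdev(values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values deviating more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Hypothetical packets-per-second measurements recorded during normal operation.
normal_traffic = [98, 102, 100, 101, 99, 100, 103, 97]
baseline = fit_baseline(normal_traffic)  # mean 100, standard deviation 2

print(is_anomalous(101, baseline))  # ordinary value, no alarm
print(is_anomalous(250, baseline))  # strong deviation, alarm
print(is_anomalous(107, baseline))  # borderline burst, also raises an alarm
```

The last call shows the false-positive problem: a benign but unusual burst that was not represented in the baseline is reported just like an attack.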
A third category, sometimes also considered a subcategory of anomaly-based IDSs, is specification-based intrusion detection [28]. In this approach, all possible behaviors of the given medical device are specified. The device's operation is then monitored. An alarm is triggered if the device transitions to an unspecified operating state. We decided to follow Mitchell and Chen's definition and consider specification-based intrusion detection as a standalone category [29] because the medical sector offers unique possibilities for specifications. It, therefore, enjoys special attention in the field of MCPS intrusion detection. Its proponents claim that it combines the advantages of signature-based and anomaly-based detection, namely the ability to identify previously unknown attacks while limiting false positives and requiring less computational power than ML-based anomaly detection. However, very detailed knowledge of the monitored medical device is needed, and this approach is therefore associated with a high initial implementation effort.
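The monitoring step can be sketched as a check of observed state transitions against a specification. The device states and allowed transitions below describe a hypothetical infusion pump and are assumptions for illustration, not taken from any reviewed study.

```python
# Hypothetical specification of an infusion pump: state -> allowed next states.
ALLOWED_TRANSITIONS = {
    "idle": {"priming"},
    "priming": {"infusing", "idle"},
    "infusing": {"paused", "idle"},
    "paused": {"infusing", "idle"},
}


def first_violation(observed_states):
    """Return the first (previous, current) transition that is not specified, or None."""
    for previous, current in zip(observed_states, observed_states[1:]):
        if current not in ALLOWED_TRANSITIONS.get(previous, set()):
            return previous, current
    return None


normal_run = ["idle", "priming", "infusing", "paused", "infusing", "idle"]
suspicious_run = ["idle", "infusing"]  # infusion starts without priming

print(first_violation(normal_run))      # no alarm
print(first_violation(suspicious_run))  # alarm on the unspecified transition
```

Writing the transition table is where the high initial effort lies: every legitimate behavior of the device has to be known and encoded beforehand.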
Furthermore, hybrid approaches combine two or more variants into a new approach. Here it is essential to state that several authors combined different ML algorithms and called their approach hybrid. Since the distinction to, e.g., ensemble learning methods was too small from our point of view, we did not follow this subsumption. Therefore, we only classified an approach as hybrid if it comprised variants from different main categories (e.g., anomaly-based and signature-based).

2) WHERE IS THE ATTACK DETECTION SYSTEM LOCATED?
Classically, there are two locations where attack detection systems are usually placed. On the one hand, a host-based IDS (HIDS) runs on the device and monitors the station's operating system, processes, or logs. On the other hand, a network-based IDS (NIDS) inspects the network traffic and often monitors the traffic of all devices connected to the network. The placement of the NIDS, particularly in segmented networks, can, in turn, influence its effectiveness and therefore be decisive. Both locations have their advantages and disadvantages. An NIDS is able to detect external threats at an early stage, but the mass of data can cause limitations, especially in large networks. While an HIDS might not notice external threats as early as an NIDS, it might detect malicious insiders that remain hidden to NIDSs [28]. In addition to this distinction, we observed a third location often chosen by the researchers in the MCPS domain: cloud or cloudlet-based IDSs. Here, too, hybrid approaches are conceivable and are pervasive in other sectors.

3) WHAT KIND OF DATA IS ANALYZED BY THE ATTACK DETECTION APPROACH?
This question often interrelates with the location of the detection system (or at least with the collector's location). At the network level, detection approaches might use metadata of captured packets or analyze the whole packet, with varying degrees of understanding of the sector-specific protocols involved. At the host level, various information about the operating system, processes, or log files can be evaluated. Of course, all this data can also be aggregated in a cloud to be processed centrally.

4) WHAT ATTACK SCENARIO IS THE PRIMARY FOCUS OF THE DETECTION SYSTEM?
Frequently, detection approaches specialize in the defense against specific scenarios. This is because an outside attack is detectable by different indicators than an insider abusing valid privileges. We identified the scenarios with the greatest research interest and those that may be underrepresented in current research.

5) WHICH DATASETS AND SOURCES ARE UTILIZED TO EVALUATE THE EFFECTIVENESS OF THE DETECTION APPROACH?
Publicly available datasets make the detection approaches of different researchers comparable. Sometimes, however, researchers cannot find a dataset that fits their use case and look for alternatives. Some build test environments with simulators or real devices, while others generate data in other ways. We examined the approaches and the most used datasets and data sources in the field of MCPS attack detection.

6) WHAT APPROACHES GO BEYOND DETECTION AND ALSO INCLUDE PREVENTIVE MEASURES?
It is often of particular research interest to not only detect but also mitigate attacks as quickly as possible. This is also appealing in healthcare, as any attack might endanger human life. On the other hand, one of the biggest challenges in the field of attack detection, especially in the case of anomaly detection, is the false-positive rate. This gets even more relevant if automated mitigation measures are taken. By reviewing the relevant articles, we explored how researchers address the potentials and risks in this regard.

B. IDENTIFICATION OF RESEARCH
To capture the current state of research, the variety of terms used in the literature for networked medical devices alone necessitated a structured approach. We were not the first to find that IT security terms and definitions diverge in healthcare. Athinaiou et al. surveyed the IT security language and observed that definitions of concepts differed in health environments [30]. We identified 22 terms used in reference to such systems (Connected Health, Connected Healthcare, Digital Healthcare, e-health network, Healthcare [...]). While there are no precise definitions, it is our impression that term combinations of medical/health and internet of things (IoT) like IoMT, mIoT, or IoHT have been used to refer not only to medical devices in hospitals but also to devices used to monitor specific health values at home. In contrast, the term MCPS was used almost exclusively for medical devices in a hospital context. However, this observation did not apply to all publications, and we noticed a convergence of the device classes. Researchers hypothesize that all sensors monitoring patients' health parameters in hospitals will be connected to local gateway devices in the future [31]. This evolution can already be observed and is the reason for the prevalence of so-called medical device gateways in hospitals that connect medical devices to the hospital network. According to ENISA, these devices presently account for 34% of all devices in the healthcare sector [5]. This development resembles the convergence of general IoT and cyber-physical systems (CPS). NIST established in a special publication in 2019 that the concepts of CPS and IoT have increasingly converged and that the definitions can now often be used interchangeably [32]. However, to clarify that this work focuses on detecting attacks on medical devices available for hospitals, we used the term MCPS.
In addition, we identified five expressions describing the detection of attacks (Detection, Network Security Monitor, Network flow, IDS, and Intrusion Prevention). The combined search strings were employed to search five electronic libraries. The results per library can be seen in table 1. In total, we obtained 5354 papers matching our search strings.
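How the two term groups might be combined into one search string can be sketched as follows. The Boolean syntax (quoted phrases joined by OR within a group and AND between groups) is an assumption, since the exact query format of each library is not stated here, and only a few of the 22 system terms are used as examples.

```python
def build_query(system_terms, detection_terms):
    """Join each term group with OR and combine both groups with AND."""

    def group(terms):
        return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

    return group(system_terms) + " AND " + group(detection_terms)


# A subset of the system terms named in the text, plus the five detection expressions.
system_terms = ["Connected Health", "Digital Healthcare", "IoMT", "MCPS"]
detection_terms = [
    "Detection",
    "Network Security Monitor",
    "Network flow",
    "IDS",
    "Intrusion Prevention",
]

print(build_query(system_terms, detection_terms))
```

A broad term such as "Detection" matches many medical publications as well, which helps explain the large number of initial hits.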

C. SELECTION OF PRIMARY STUDIES
Following Kitchenham et al., two authors performed a two-step screening of all obtained papers and selected those relevant to the research topic. To make the process comprehensible and verifiable, we defined the selection criteria in tables 2 and 3. In the first quantitative screening, the title and abstract of the publications were evaluated. The vast majority of the papers were excluded in this step. For the qualitative screening, 358 papers remained. The high rejection rate is attributable to the fact that many intrusion detection synonyms are also used in a medical sense. Two examples of major fields in medical research are disease detection and monitoring of patients' health parameters utilizing various medical devices. Unfortunately, those terms could not be excluded from our search terms for obvious reasons, which led to a high rate of false positive results.
In the following qualitative screening, the full-text versions of the 358 papers were consulted to single out those relevant to our research questions. The papers were screened by two researchers independently, and the resulting selections of included papers differed. The agreement was measured using Cohen's kappa statistic [33]. The initial value of the kappa statistic was 0.826. Afterward, all disagreements were discussed and resolved. In the end, 118 papers were selected for data extraction.
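Cohen's kappa relates the observed agreement p_o of two raters to the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following minimal sketch computes it for two lists of screening decisions. The ten decisions are invented for illustration and are not the data of this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Compute Cohen's kappa for two equally long lists of categorical decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of decisions")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters decided identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented screening decisions of two reviewers for ten papers.
reviewer_1 = ["in", "in", "out", "out", "out", "out", "in", "out", "out", "out"]
reviewer_2 = ["in", "in", "out", "out", "out", "out", "out", "out", "out", "out"]

print(round(cohens_kappa(reviewer_1, reviewer_2), 3))
```

In this example the reviewers agree on nine of ten papers (p_o = 0.9) while chance alone would yield p_e = 0.62, so kappa is well below the raw agreement rate.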

D. DATA EXTRACTION AND SYNTHESIS
For data extraction, the remaining studies were read in full and categorized by the research questions defined in section III-A. Thereby, we were able to answer the questions as comprehensively as possible. Here we followed the recommendation of Kitchenham et al. and assigned one researcher as the data extractor and the other as the data checker. Emerging disagreements have been discussed, and all researchers have agreed on the final classification.
Finally, the results of the review were summarized. In the following section, we will provide the gained insights.

IV. RESULTS
We identified 118 papers that could contribute to answering the research questions. However, not every paper could be consulted to answer every research question. One example is the study by Ardito et al., who outline a framework but did not implement it or test it using a dataset [34]. Therefore, while we were able to use this publication to evaluate the proposed detection approach (RQ 1), it was not suitable for answering the question about the used data sources (RQ 5). The exact number of papers included in the evaluation of each research question is indicated in each subsection.

A. RQ 1-UTILIZED DETECTION APPROACH AND EMPLOYED TECHNOLOGY
We identified three main approaches in the context of MCPS attack detection: anomaly-based detection, signature-based detection, and specification-based detection. As shown in table 4, most researchers focus on an anomaly-based detection approach (98). The majority proposes an ML algorithm they tweaked to be most suitable for MCPS (42). Often, the approaches consist of an optimized feature/dimensionality reduction algorithm and an ML algorithm that performs the actual detection. While most researchers substantiate why their approach works best (e.g., Saheed et al. with their swarm-based approach [105]), others focus on optimizing parts of their approach. E.g., Priya et al. measure the benefits of different dimensionality reduction approaches [95]. The detection algorithm then classifies the traffic, flow, or packet into malicious/benign (binary classification) or even categorizes it into specific attack groups. One example of the latter is the work of Mowla et al., which attempts to identify an attack and classify the attack type [86]. Astillo et al. focus on one specific MCPS: a diabetes management control system consisting of three separate components: a sensor that steadily measures a patient's glucose level (continuous glucose monitor (CGM)), an insulin pump, and a controller. Their detection approach first estimates the blood glucose level of the patient. Thereafter, estimated and actual values are compared and derived as features. Eventually, the classification module evaluates if the current event cycle is anomalous [44]. Khan et al. criticize that researchers have so far focused on optimizing accuracy and false alarm rate while no attention has been paid to interpreting the prediction model. Therefore, they use an explainable model that provides information about the features leading to the prediction. Their motivation is to help security personnel to react timely and in the right way to an alarm and to increase trust in their detection model. 
They explain this is especially necessary for the healthcare domain since there are too few security experts [74]. Twenty of the papers compare several anomaly-based approaches to one another and assess the advantages and disadvantages of the approaches in the context of MCPS. For example, Newaz et al. developed HealthGuard, which utilizes four ML-based detection techniques (Artificial Neural Network, Decision Tree, Random Forest, k-Nearest Neighbor) [89]. The researchers compare the algorithms in terms of accuracy, precision, recall, and F1-score (the harmonic mean of precision and recall). Fourteen researchers combine different anomaly-based approaches into a new, amalgamated approach. While most state how their approach improves the anomaly detection, Kintzlinger et al. emphasize that their proposed combination of ML algorithms and statistical methods performs worse than the use of statistical methods alone [77].
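The four metrics named above follow directly from the confusion matrix of a binary detector. The sketch below computes them from counts of true/false positives and negatives. The example counts are invented and do not stem from any reviewed study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of alarms that are real attacks
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of attacks that raise an alarm
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented example: 1000 events, of which 90 are attacks.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With rare attacks, accuracy stays high even for a mediocre detector, which is why precision, recall, and F1 are usually reported alongside it.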
Another frequently adopted approach is Federated Learning (17), which researchers use to address the challenges of healthcare data privacy (e.g., Otoum et al. [92], Thapa et al. [117], Ferrag et al. [54]). It is a machine learning technique that has recently attracted much attention, not only in medical applications, because it protects data privacy. Other ML approaches often store data centrally without taking privacy-preserving measures. This turns these central data stores themselves into lucrative targets. In contrast, Federated Learning establishes a global learning platform that combines the knowledge of locally available models. The process of training an algorithm runs over separate decentralized models. Local datasets are used without revealing private data. Federated learning can thus keep the training dataset on the devices so that the patient's data is not needed for training on the server side [93]. While several researchers include one network segment or a whole hospital in a local model, Gupta et al. propose a digital twin for each patient and train their local model on it. The advantage is that all collected data belong to a single patient, and the researchers can correlate more parameters [59].
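How locally trained models can be combined without sharing raw data can be sketched with federated averaging (FedAvg), a common aggregation rule in which the server averages the clients' model weights, weighted by their local sample counts. This rule is chosen here only for illustration; the cited works may aggregate differently. The weight vectors and sample counts are invented.

```python
def federated_average(local_updates):
    """Aggregate (weights, num_samples) pairs into one global weight vector.

    Only model weights and sample counts reach the server; the local
    training data never leaves the clients.
    """
    total_samples = sum(num_samples for _, num_samples in local_updates)
    if total_samples == 0:
        raise ValueError("at least one client must contribute samples")
    dimension = len(local_updates[0][0])
    return [
        sum(weights[i] * num_samples for weights, num_samples in local_updates)
        / total_samples
        for i in range(dimension)
    ]


# Invented updates from two hospital segments after one round of local training.
updates = [
    ([1.0, 2.0], 100),  # client A: model weights, number of local samples
    ([3.0, 4.0], 300),  # client B
]
global_weights = federated_average(updates)
print(global_weights)
```

Client B contributes three times as many samples as client A, so the global model lies closer to B's weights.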
Specification-based detection approaches are the second largest group, although far behind anomaly-based approaches (9). We present three examples in the following: To detect maliciously acting devices instead of attacks, Mitchell and Chen devise a behavior specification-based approach. They define behavior rules and derive attack states from there. Subsequently, the researchers develop state machines. The authors promise this approach could detect unknown attacks while keeping the overhead and false positive rate low [130]. Refining this work, Abdulhammed et al. create a hardware approach (Field Programmable Gate Array (FPGA) chip) that employs behavior rules to detect anomalies [127]. Their approaches address the resource constraints of MCPS. Fang et al. also observed and analyzed the behavior of the monitored devices. They suggest a combined approach of fuzzy core vector machine and rough set (RS) as preprocessor (peculiarity: RS acts as a filter for apparent abnormal behavior) [129].
Only four researchers propose exclusively signature-based approaches in their publications. Meng [137].
Although signature-based detection may seem to be the least pursued approach, some researchers include it in a more general security strategy, combine it with another method (hybrid), or use it as a means of comparison. Dupont et al. wrote a protocol dissector for the IDS Forescout SilentDefense [140]. Magomedov recommends a signature-based approach for identified DICOM vulnerabilities [6]. Nguyen et al. designed a secure logger for medical devices with some detection capabilities. It consists of a dongle attached to the medical device that sends data to a remote cloud. The detection component focuses on packet or sequence tampering. In contrast, the researchers consider compromised medical devices or devices sending compromised logs out of scope [145]. Radoglou-Grammatikis et al. compare their ML-based approach to Suricata loaded with attack signatures of Cisco Talos for the IEC 60870-5-104 protocol. The signature-based approach performs better than most anomaly-based solutions presented in their work [97].

B. RQ 2-DATA COLLECTION AND PROCESSING LOCATIONS
Researchers choose different locations and thus varying data sources for their IDSs. We differentiated between the device, network, and cloud/cloudlet locations and combinations of two out of those (figure 1). It is essential to state that we chose the location network if network traffic was captured, the location device if data was collected and processed on the device (e.g., log data), and the cloud if data was collected in or from cloud services.
Most researchers select the network as the sole location for their approach (52%). While a majority chooses a classical IP-based NIDS approach, some utilize particular circumstances of the healthcare sector or a specific MCPS. For instance, Gao and Thamilarasu propose a gateway device for an IMD and its programmer device. It acts as a man-in-the-middle and is supposed to detect attacks between those devices [56]. Mahler et al. developed an IDS specifically for a CT device. It intercepts traffic between the host PC and the device (on the CAN bus) [82].
The location cloud(-let) was chosen by eleven percent of researchers. This was often the case if researchers gathered health data from a manufacturer's cloud. Examples are Gupta et al. [59] and Newaz et al. [89], who correlate different vital signs of patients. The approach of Gupta et al. stood out since they took the first steps in matching network data and health data. From this, they assess the monitored user's behavior to detect abnormalities [58].
Similarly, eleven percent of researchers opted for a pure on-device approach. To meet the special requirements of the healthcare sector, many researchers focus on low resource consumption while preserving the patients' privacy. A particular advantage of the location device is that researchers can benefit from the special conditions of hospitals and clinics. For example, Ardito et al. tailor their approach to an electrocardiography (ECG) device and use its user interface to display warnings in case of an anomaly. Then, feedback is requested from the treating physician. In this way, the physician is warned as quickly as possible in dangerous situations, and in the event of a false positive result, the effects can be limited [34]. A disadvantage of the location is that to implement a HIDS on a medical device, most researchers rely on device-specific knowledge or access to source code. This limits the transferability and scalability of the approach in many cases.
Adding the combinations of the locations device & network (12%) and device & cloud (4%), the number of researchers who combine the on-device approach with another location (16%) is higher than the number of researchers who use a pure on-device approach (11%). In total, 27% of researchers therefore decided to include device-specific information in their detection approach. Astillo et al. present one example of a combination of device and cloud. In their federated learning approach, they collect the data directly from the Continuous Glucose Monitor and process it on the controller unit of the MCPS. To share the knowledge between similar setups (other diabetes management control systems), only a submodel is generated on the device and subsequently fused into a central model on a cloud server [44]. The approach of Mitchell and Chen is a combination of host-based and network-based detection. Every node in the network acquires a set of behavior rules and can monitor the behavior of its trusted peers. So every medical device is monitored while it is also part of the detection approach [130]. Meng et al. suggest performing the detection on every node individually and recommend a blockchain as an exchange platform for necessary signatures and a list of blocked nodes. As every node could add signatures to the chain in this scenario, the authors propose a centralized trust management scheme [25].
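The device-and-cloud split described for Astillo et al. can be illustrated with a minimal federated-averaging sketch. It is not the authors' implementation: the aggregation rule (a sample-weighted mean of parameter vectors), the function name `fuse_submodels`, and the toy numbers are assumptions chosen for illustration. The point is only that each device shares model parameters rather than raw glucose readings.

```python
from typing import List, Sequence, Tuple


def fuse_submodels(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Fuse per-device parameter vectors into one central vector.

    Each update is (parameters, number_of_local_samples). The result is the
    sample-weighted mean of the parameters, i.e. plain federated averaging
    (an assumed rule, not taken from the reviewed paper).
    """
    if not updates:
        raise ValueError("at least one submodel is required")
    dim = len(updates[0][0])
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("sample counts must sum to a positive number")
    fused = [0.0] * dim
    for params, n in updates:
        if len(params) != dim:
            raise ValueError("all submodels must have the same shape")
        for i, p in enumerate(params):
            fused[i] += p * n / total
    return fused


# Two hypothetical controller units with different amounts of local data.
central = fuse_submodels([([1.0, 2.0], 100), ([3.0, 4.0], 300)])
```

In this sketch the unit with more local samples pulls the central model further toward its own parameters. Whether that weighting is appropriate for heterogeneous patients is a design choice the sketch does not settle.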
Besides the aforementioned categories, we could also observe that some researchers neglect the location choice of data collection. Instead, they base their detection approach on existing datasets (further elaboration in section IV-E) in computing platforms and simulation environments such as Matlab and Simulink. In this case, the dataset dictates the collection location. Examples are: Akram et al. [36] and Begli et al. [138]. Others combine the toolboxes with different simulators or platforms. Chen et al. employ Matlab and a cloudlet mesh simulator to calculate and evaluate the optimal number of collaborating IDSs in their cloudlet mesh approach [139]. In contrast, other researchers embed the proposed detection approach in a holistic security concept for a realistic hospital environment and even consider hospital network specifics. One example is Lakka et al., who describe an incident management approach, complementing their swarm-based detection with signature-based detection and consolidating the data in a hospital SIEM. A layered model outlines what information is collected where, sent where, and processed where [143]. Khan et al. also consider how their approach could be rapidly deployed in many hospitals. To this end, they have developed a framework for deploying their approach as Infrastructure as a Service in the cloud and as Software as a Service in a hospital network [75].

C. RQ 3-INCLUDED FEATURES AND CHARACTERISTICS OF THE LEVERAGED DATA
In contrast to the IDS locations, we observed a higher variety in the examined features (figure 2). The majority of researchers base the detection on non-medical contextual information (50%), i.e., analyze technical data and transfer gained IDS-insights from other sectors to the medical sector. One often-used approach is the analysis of the network packet's contents. The medical sector is particularly interesting for ML-based detection approaches because of the
FIGURE 2. The type of data from which the features and characteristics were obtained. In total, 114 papers were applicable for the evaluation. ''Combination'' consists of a mixture of two data types, where network flow data and contextual (non-medical) data make up 13%, metadata and contextual information 3%, and metadata and network flow data 1%.
The high number of papers that derive indicators of an intrusion from contextual medical information is remarkable. 20% of papers base their detection approach solely on such information. Mitchell and Chen were the first to incorporate medical context and correlations for attack detection (e.g., one proposed rule for conspicuous behavior is if the pulse is above a certain threshold during an analgesic request of the patient) [29]. Siniosoglou et al. utilize medical data such as ECG and arterial blood pressure [44]. Newaz et al. relate health values from different devices and interpret the results. They hypothesize that an attack usually targets one device at a time and that a deteriorating state of health should simultaneously affect various measured health values. If only single values deviate, they infer that this data must have been manipulated. For example, if the patient's oxygen level drops due to health reasons, her heart rate would naturally also decrease. So if only one of the values changes, the IDS will detect an anomaly and raise the alarm [59]. Hady et al. propose a packet comprehension functionality: Their models recognize the heart rate, respiration rate, systolic blood pressure, diastolic blood pressure, blood oxygen, and more from captured network traffic [61].
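The cross-signal hypothesis attributed to Newaz et al. above can be sketched as a simple consistency check. This is an illustrative simplification, not the authors' model: the relative tolerance, the signal names, and the "exactly one signal deviates" rule are assumptions, and baselines are assumed to be non-zero.

```python
from typing import Dict, List


def deviating_signals(baseline: Dict[str, float],
                      reading: Dict[str, float],
                      rel_tolerance: float = 0.15) -> List[str]:
    """Return the vital signs whose relative change from baseline exceeds the tolerance."""
    return sorted(
        name for name, base in baseline.items()
        if abs(reading[name] - base) / abs(base) > rel_tolerance
    )


def is_suspicious(baseline: Dict[str, float],
                  reading: Dict[str, float],
                  rel_tolerance: float = 0.15) -> bool:
    """Flag a reading when exactly one signal deviates while all others stay stable."""
    return len(deviating_signals(baseline, reading, rel_tolerance)) == 1


# Hypothetical patient baseline and two readings.
baseline = {"spo2": 98.0, "heart_rate": 70.0, "systolic_bp": 120.0}
tampered = {"spo2": 80.0, "heart_rate": 70.0, "systolic_bp": 120.0}
clinical = {"spo2": 80.0, "heart_rate": 95.0, "systolic_bp": 120.0}

tampered_flag = is_suspicious(baseline, tampered)  # only spo2 moved
clinical_flag = is_suspicious(baseline, clinical)  # two signals moved together
```

A rule this coarse would also miss an attacker who manipulates several signals consistently, which is one reason the reviewed approaches learn the correlations from data instead of hard-coding them.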
Others utilize network flow data (9%) or packet metadata (4%) and claim that this is more suitable than inspecting all packets. Besides the already mentioned data masses in the health sector, some researchers give additional reasons. Fernandez et al. argue, for example, that their primary motivation for using network flows is the more and more encrypted data sent over the network, due to which inspecting packets would be pointless [53].
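To make the distinction between packet inspection and flow data concrete, the following sketch aggregates packet metadata into unidirectional flows keyed by the usual 5-tuple. It is a generic illustration rather than the method of any reviewed paper. The `PacketMeta` fields and the chosen flow statistics are assumptions. Note that no payload is read, so the aggregation works the same way for encrypted traffic.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class PacketMeta:
    """Header-level information only; the payload is never inspected."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    length: int


FlowKey = Tuple[str, str, int, int, str]


def aggregate_flows(packets: Iterable[PacketMeta]) -> Dict[FlowKey, Dict[str, float]]:
    """Group packets by 5-tuple and compute simple per-flow statistics."""
    flows: Dict[FlowKey, Dict[str, float]] = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flow = flows.get(key)
        if flow is None:
            flows[key] = {"packets": 1, "bytes": p.length,
                          "first_seen": p.timestamp, "last_seen": p.timestamp}
        else:
            flow["packets"] += 1
            flow["bytes"] += p.length
            flow["first_seen"] = min(flow["first_seen"], p.timestamp)
            flow["last_seen"] = max(flow["last_seen"], p.timestamp)
    for flow in flows.values():
        flow["duration"] = flow["last_seen"] - flow["first_seen"]
    return flows


# Hypothetical capture: a monitor talking to a server, plus one unrelated packet.
capture = [
    PacketMeta(0.0, "10.0.0.5", "10.0.0.9", 50000, 443, "tcp", 120),
    PacketMeta(2.0, "10.0.0.5", "10.0.0.9", 50000, 443, "tcp", 80),
    PacketMeta(1.0, "10.0.0.7", "10.0.0.9", 50001, 443, "tcp", 60),
]
flows = aggregate_flows(capture)
```

The resulting per-flow counters (packets, bytes, duration) are the kind of features a flow-based detector would consume in place of packet contents.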
Several researchers combine two types of features in their detection approach to identify attacks (17%). One example is the expansion from the field of disease classification to attack detection on medical data, as Haque   this approach and propose two supplementary models: one model to detect intrusions from network flow data and one model to detect anomalies from healthcare data [112]. Sehatbakhsh et al. and Rao et al. follow entirely different approaches. The former propose using the electromagnetic (EM) signals generated by the monitored medical devices during operation to distinguish between normal and malware-infected MCPS [109]. The latter suggest monitoring system operations. They hypothesize that a malware infection is identifiable by monitoring processes and other system parameters (especially the execution time) of a medical device [100].

D. RQ 4-ADVERSARIAL FOCUS
Many researchers limit the applicability of their work by making assumptions about attack types, targets, and locations, among other things. We have investigated which defense scenarios the detection approaches focus on. Here, we differentiated between external threat actors, malicious insiders, attack scenarios utilizing malware, and approaches focusing on detecting more than one attack scenario (figure 3).
Most researchers concentrate on insider scenarios (37%) or the combination of internal as well as external threats (25%). 23% focus solely on the detection of external threats. 9% of the researchers focus on the detection of malware infections. It is essential to state that we also assigned papers to the insider category that do not explicitly name such an attack scenario as a limitation but require an attacker to have access to the network or a device (e.g., the attacker is able to spoof a MAC address, or the drug dosage is monitored for manipulations). So an external attacker who has already compromised an MCPS and can be detected only once they move laterally in the network or interfere with the normal function of the MCPS is considered an insider in our classification scheme. The behavior-based approach of Fang et al. contrasts with this scheme, as it promises to defend against external attackers. The model that they call detecting illegal behavior (DIB) focuses on the detection of maliciously acting accounts and devices (e.g., accounts that have been taken over through shoulder-surfing attacks) [129]. As it is technically impossible to differentiate between such a compromised account and a real insider sending malicious commands, we decided to follow our definition. While most other behavior-based detection approaches concentrate on the detection of insiders, Mitchell and Chen additionally claim to be able to detect malware, as they expect malware to change the behavior of an infected device as well [130].

E. RQ 5-DATASETS USED FOR VALIDATION
Researchers need data to train and test detection approaches (especially if they are ML-based). There is an additional benefit when multiple researchers use the same dataset, as the different detection methods become comparable. Choosing a dataset fitting the task is crucial since datasets are generated for specific purposes. We have organized the used datasets and -sources into two categories, each with two subcategories, as shown in figure 4. The first category includes approaches that utilize publicly available datasets. Here we differentiate between security and medical datasets. The second category deals with publications that have created their data themselves. While datasets from some approaches are available to the community, many remain unpublished.

1) PUBLICLY AVAILABLE DATASETS
In non-health domains, researchers usually train and test novel IDSs utilizing publicly available security datasets. Our analysis shows that 51% of the researchers also follow this approach. Figure 5 presents the different datasets. 8% of the researchers in this group utilize the KDDcup-99 dataset. This dataset was developed for the KDD-cup competition in the year 1999, whose goal was to develop an NIDS [146]. Several problems, such as redundant records, have been reported, and the successor, the NSL-KDD dataset, was released in 2009 [147]. 18% of the researchers in this group use this dataset. Both datasets contain several IT protocols such as HTTP, SMTP, and FTP. Among others, Khan et al. bemoan that the NSL-KDD dataset lacks the latest attack vectors [74]. The Canadian Institute for Cybersecurity (CIC) published several datasets promising to have more recent attacks resembling real-world data. Their top priority varies from dataset to dataset. In the 2017 dataset, they provided realistic background traffic and simulated the behavior of 25 users [148]. In the 2018 dataset, they focused on insider attacks and provided system logs of every machine [149]. Most researchers relying on CIC datasets in the examined works use the CIC IDS 2017 dataset (8%). It consists of raw packets in PCAP files as well as labeled flows [148]. However, none of the presented datasets comprise IoT or MCPS protocols. The University of New South Wales (UNSW) fills the gap with its datasets ToN-IoT [150] and Bot-IoT [151]. In these datasets, a real-world network environment containing both IT and IoT devices was mimicked. Besides the network traffic, the researchers from UNSW provide Windows and Linux audit traces and telemetry data for the IoT services. Their IoT testbed consisted of various devices, among others: a fridge, a garage door opener, a thermostat, a GPS tracker, a motion sensor, and a weather station [150]. MCPS, however, have not been included. At 24%, the ToN-IoT dataset is the most used dataset, while the Bot-IoT dataset is employed by 8% of the researchers that rely on publicly available datasets.
FIGURE 5. During our analysis, we encountered 18 different datasets that were used 72 times. Some research groups used more than one dataset and were therefore assigned to more than one category.
Some researchers employ the medical sector only as motivation and ignore the discrepancy between real network traffic in hospitals and the datasets they choose to support their detection approach. Others point to the absence of MCPS traffic in the dataset and handle the inadequacy differently. For instance, Schneble and Thamilarasu explain that in the context of MCPS, two aspects are crucial to keeping the detection latency low: feature selection and reducing the amount of data processed by the IDS. Hence, to test the effectiveness of their feature ranking and selection algorithm, they consult the MNIST digit recognition dataset. This dataset contains 60,000 handwritten digits. They choose this dataset, among other reasons, because of the large feature space and the easy access to the data [108]. In contrast, Ferrag et al. explicitly deem the MNIST dataset unsuitable for training and testing IDSs in the context of medical devices [13]. Hameed et al. state that their approach is only applicable for detecting MCPS in a real environment if it is properly adapted prior to deployment [63]. Tabassum et al. use the datasets KDDcup-99 and NSL-KDD and merge in their self-generated IoT traffic to cope with the missing IoT traffic in the named datasets. However, they do not explain which MCPS they employed for the generation [114].
Some researchers harness medical datasets containing patient and medical data to identify attacks from those datasets (7%). The most commonly used dataset is the MIMIC III dataset, which contains health-related data of forty thousand patients who received intensive care at the Beth Israel Deaconess Medical Center in Boston, USA [152]. 25% of the researchers in this group used this dataset. While the researchers found the specifics of the medical data particularly valuable for attack detection, the drawback is that none of these freely available medical datasets contain attacks. Therefore, alternative ways must be found here as well. One idea given by Siniosoglou et al. is to use two distinct datasets to train their neural network: a publicly available medical dataset and the UNSW-NB intrusion detection dataset for network flow data [112].

2) SELF-GENERATED DATA(SETS)
Many researchers generate their own data(-sets) and work with that data without publishing it afterward (36%). While this results in the fact that subsequent studies cannot be compared to their work, the reasons given are manifold. On the one hand, this data often results from cooperation with hospitals and could reveal real patient data. One example is Boddy et al., who captured network traffic in a UK hospital and depict the complexity of the network infrastructure in a visualization approach [153]. Even if this data could be anonymized, many argue that hospitals prefer to be on the safe side and not risk the exposure of any patient data. On the other hand, the data might be especially suited to an approach or just randomly generated, as in the work of Mitchell and Chen. They generated random data following their devised state machine [130]. This data would have had no benefit for any other researcher, as their states are unique to their approach.
As we already addressed in section IV-B, researchers used computing and simulation environments to test their new attack detection algorithm on existing datasets. Another approach is to utilize simulators and frameworks to model an even more realistic MCPS environment. Among these approaches are those designed for a medical environment and those whose original purpose is different. Two examples of non-medical simulators are presented in the following. Meng et al. operate a publicly available tool to generate attacks on wireless networks. The attacks are not specifically adapted to MCPS environments [25]. Thamilarasu et al. employ Castalia, a simulator for WSNs and WBANs, in several papers [26], [56], [116]. Such toolboxes and frameworks originally developed for other purposes have certain limitations regarding MCPS simulation. Therefore, several researchers adapt various open-source medical device simulators to their needs or implement their own medical device simulators. Astillo et al. operate the UVA/Padova Type 1 diabetes simulator, which has been approved by the U.S. Food and Drug Administration (FDA), in their testbed and generate their test data with it. They also use an extended simulator version to induce artificial attacks [44]. Sehatbakhsh et al. leverage open-source code to deploy a syringe pump on various architectures. They found a buffer-overflow vulnerability in the syringe pump's source code that they were able to exploit [109]. Raiyat Aliabadi et al. employ OpenAPS, an open-source Smart Artificial Pancreas [132]. They use fault injection as the source for unknown attacks.
Recent research efforts concentrate on the standardization of medical device inter-connectivity to address the heterogeneity of network protocols used by medical devices in hospitals mentioned in section I. This is not only an IT security challenge. One project that has already made some progress is the community implementation of an integrated clinical environment, OpenICE. It provides a framework for the integration of medical devices into an integrated clinical environment (ICE). The developers even promise to be able to connect legacy devices to their ecosystem. For that, they developed adapters for those devices and a novel network protocol [154]. Some security researchers propose IDSs for networks based on OpenICE. Li et al. use OpenICE to simulate future medical devices and accomplish a data flow analysis in an OpenICE network [131]. Fernandez et al. analyze network flows of malware outbreaks in such environments [53].
To mimic real-world hospital conditions even better, many researchers employ actual medical devices in a testbed. Figure 6 shows the different devices. While the medical devices most used are blood pressure sensors (12%), a clear favorite could not be determined. Various research groups cover multiple devices. A prominent example is the testbed of Fang et al., which contains 21 different medical devices and a malicious access point to capture network traffic. From the device behavior, they derive 21 behavior rules. Instead of attacking the devices, they define operation rules for each device and specify some operation rules as normal and the remaining as abnormal behavior [129]. This way, no real attacks are conducted. Instead, some behavior is defined as malicious. The detection system of Kintzlinger et al. is explicitly designed for attacks directed at programmer devices for implantable cardioverter defibrillators. They cooperated with two cardiology experts from a university medical center to create malicious programmings [77]. Yan et al. analyzed a medical shoe with 99 sensors attached. It is designed to detect the instability and balance of patients. The researchers statistically correlate the data of the different sensors in a shoe and, thereupon, identify attacks using anomaly detection [123].
Similar to Fang et al. before, we observed that many research endeavors were conducted utilizing household IoT devices rather than medical devices for data collection [129]. One example is Gupta et al., who built a conjoined testbed consisting of medical devices such as pulse oximeters and smart home devices like a fridge and a door sensor [58]. The same phenomenon occurs in the field of malware detection. Since there is little to no research on MCPS-specific malware, those researchers investigating malware outbreaks in clinical environments fall back on existing malware samples. Some utilize IT malware (e.g., Chowdhury et al. [52]), while others use malicious Android APK packages and explain that many mobile devices in hospitals run Android [40].
Only six percent of research groups publish their generated datasets of MCPS-specific traffic. Nguyen-An et al. create an IoT traffic generator named IoTTGen. Their focus is smart home IoT as well as biomedical IoT. They analyze the behavior of smart homes and medical devices to build templates for those devices. The generator also allows adding new devices, if the traffic patterns are known. To generate anomalous packets, they extract attack traffic traces from the Bot-IoT dataset and inject them into their generated data. Since the Bot-IoT dataset does not contain MCPS-specific attacks, this generator cannot generate such attacks either. And since one finding in this work is that significant differences between the traffic of smart home devices and medical devices exist [155], it stands to reason that traces of attacks will differ as well.
Three dataset developers have focused on specific protocols or technologies. Radoglou-Grammatikis et al. present an IEC 60870-5-104 protocol dataset. It contains protocol-specific attacks. They use the IEC Testserver to deploy their MCPS devices without specifying what devices were modeled in detail [97]. Zubair et al. provide a Bluetooth-enabled medical device dataset. They record Bluetooth network traffic and generate flow data from such devices, explicitly excluding the collection of patients' exact health data. The attacks performed during the data recording are also Bluetooth-protocol-specific [126]. Hussain et al. focused on Message Queuing Telemetry Transport (MQTT) traffic and developed a tool for generating MQTT-based MCPS traffic. The result is provided as an open-source dataset [70].
Two self-generated and published MCPS datasets have already been reused by other researchers. Ahmed et al. operate the Libelium MySignals healthcare kit to generate their dataset. This kit provides a platform for the development of medical devices and eHealth applications. The researchers use three available health sensors in their testbed. However, their attacks are not specific to medical protocols. Their ECU-IoHT dataset provides the recorded network packets in PCAP format and network flows recorded using Argus [35].
Hady et al. record the network traffic of their testbed, which consists of several small medical devices, and extract traffic flows and patient data from it. This recording is published under the name WUSTL-EHMS dataset. Among their carried-out attacks is the manipulation of medical data. Thereby, they directly integrated attacks on medical network traffic in their dataset, even if those are limited to spoofing and data modification from a MITM position [61]. In addition, among the self-generated and published datasets in our survey, the WUSTL-EHMS dataset has been used most often by other researchers to detect attacks on MCPS (four times).

F. RQ 6-ATTACK PREVENTION ENDEAVOURS
Attack detection is often closely associated with attack prevention since attackers and malware act fast, and a manual response often results in data loss or, in the case of the medical sector, patient harm. Therefore, a timely reaction is an often invoked point. The trouble is that an automatic reaction based on a false positive might also harm a patient. One example is a higher-than-usual drug dosage given to a patient because of a life-threatening condition. If an IDS identifies this as an overdose attack and preventively interrupts medication delivery, the patient might die. Researchers have to consider these exceptional circumstances and come up with sector-specific solutions. Out of the 118 primary studies we reviewed, only 14 studies address attack prevention or mitigation. Others, such as Kumar et al., see the need for attack mitigation but choose to merely alert an administrator if an attack is detected and take no further action to thwart it [78].
Most preventive approaches (9) leverage software-defined networking (SDN). (Fernández) Maimó et al. propose a decision and reaction module in their approach. It consists of a rule-based decision component and a reaction and notification component. Utilizing network function virtualization and SDN, medical devices can be isolated automatically. They emphasize that their approach is not to prevent the actual attack but to reduce the reach of the attack by preventing the attacker or malware from accessing more devices [53]. This strategy is also chosen by most other SDN-based prevention and mitigation approaches.
Others concentrate on preventing a specific attack vector in a specific environment. For example, Thapa et al. utilize SDN mitigation rules to react to ransomware spread in an ICE [117]. Similarly, Bassene and Gueye focus their work on detecting DDoS attacks against hospital networks. They also propose the utilization of SDN [49], but unlike the previous approach, theirs excludes entire subnets to counter this specific attack type.
Radoglou-Grammatikis et al. are part of the few who discuss the potential consequences of automatic preventive measures. They draw attention to the fact that an attack might come from a device still used for legitimate health-related operations. Their notification and response module weighs this risk of causing higher costs for the healthcare organization against the threat of the detected attacks. Eventually, it decides whether to isolate the device via automatically generated and applied firewall rules, or simply to report it to an administrator and ultimately have that administrator make the mitigation decision [97].
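The trade-off described above, isolating a device versus merely notifying an administrator, can be sketched as a small decision function. This is a hypothetical illustration and not the module of Radoglou-Grammatikis et al.: the two scores, their 0 to 1 scale, the `margin` parameter, and the action names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    device_id: str
    threat_score: float    # assumed 0..1 estimate of how severe the detected attack is
    isolation_cost: float  # assumed 0..1 estimate of the harm caused by isolating the device


def choose_response(detection: Detection, margin: float = 0.2) -> str:
    """Isolate only when the threat clearly outweighs the cost of isolation.

    Returns "isolate" or "notify_admin". In the second case the final
    mitigation decision is left to a human.
    """
    for value in (detection.threat_score, detection.isolation_cost):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    if detection.threat_score - detection.isolation_cost >= margin:
        return "isolate"
    return "notify_admin"


# Hypothetical detections: a rarely used workstation and an infusion pump in active use.
workstation = choose_response(Detection("ws-17", threat_score=0.9, isolation_cost=0.3))
infusion_pump = choose_response(Detection("pump-04", threat_score=0.9, isolation_cost=0.8))
```

The same threat score leads to different actions here because the pump is still needed for treatment. How the two scores would be estimated in practice is left open by this sketch.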
One example of a non-SDN-based approach is MedMon. MedMon is placed in a man-in-the-middle position between a controller and a wirelessly-connected medical device. Attack prevention works by jamming the identified malicious wireless connection. Regarding the option of a false positive, MedMon can be operated in different modes. If a valid connection from the controller to the insulin pump in the example of the researchers is jammed, the patient can manually deactivate the jamming [134]. Since insulin delivery does not have to occur within seconds, a patient can usually react to a warning. Therefore, this approach might be suitable for this particular device type. Contrarily, it might not be suitable for medical devices with other preliminary requirements.
Another non-SDN-based, innovative approach by Rao et al. proposes a new resilient design for MCPS. They suggest employing different operating modes in the MCPS architecture. Threat mitigation is realized by automatically changing the operating mode based on calculated risk values. In 2018, they presented the idea of a so-called multi-modal system in the form of a pacemaker [100], while in 2019, they expanded the idea to an insulin pump [101]. Carreon-Rascon refined it and added self-healing capabilities to the system. In addition to the operational modes and threat mitigation policies, self-healing policies are proposed and linked to the tasks of the different modes. Once the steps of the active mode have been executed, it is possible for the MCPS to switch to the next lower-risk mode and execute the self-healing tasks coupled to this mode [51].
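The mode-switching idea can be illustrated with a minimal sketch that picks the most capable operating mode whose current risk value is acceptable. The mode names, the risk values, and the threshold rule are hypothetical and are not taken from Rao et al. or Carreon-Rascon. The sketch also omits the self-healing policies.

```python
from typing import Dict, Sequence


def select_mode(modes: Sequence[str],
                mode_risk: Dict[str, float],
                threshold: float) -> str:
    """Return the first mode (ordered from most to least functionality)
    whose calculated risk does not exceed the threshold.

    If every mode exceeds the threshold, fall back to the last, most
    restricted mode so that essential operation continues.
    """
    if not modes:
        raise ValueError("at least one operating mode is required")
    for mode in modes:
        if mode_risk[mode] <= threshold:
            return mode
    return modes[-1]


# Hypothetical modes of an insulin pump, ordered by decreasing functionality.
MODES = ["full_connectivity", "read_only_telemetry", "essential_therapy_only"]

normal = select_mode(MODES, {"full_connectivity": 0.2,
                             "read_only_telemetry": 0.1,
                             "essential_therapy_only": 0.05}, threshold=0.5)
under_attack = select_mode(MODES, {"full_connectivity": 0.8,
                                   "read_only_telemetry": 0.4,
                                   "essential_therapy_only": 0.1}, threshold=0.5)
```

Falling back to the most restricted mode instead of shutting down reflects the concern raised earlier in this section that stopping a medical function outright can itself endanger the patient.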

V. DISCUSSION AND IMPLICATIONS
Overall, many good reasons and motivations for intrusion detection in the medical sector have been published. It is clear that the healthcare sector is receiving special attention, and many argue that particular challenges require special solutions. In the following, we use the knowledge gained on the state of research to discuss obstacles and limitations of attack detection for MCPS. By deriving five future research topics (A-E), we would like to support prospective research projects to take a targeted direction.

A. CAPTURING THE HEALTH SECTOR-SPECIFIC REQUIREMENTS FOR ATTACK DETECTION
The results show that the majority of researchers see the difficulties of attack detection in the healthcare domain as detecting attacks with a low false positive rate and with as little computing power as possible while preserving patient privacy. By far the most popular approach is anomaly detection, while only a few researchers discuss how understaffed IT security teams could handle results containing false positives. Overall, we observed that many researchers assume different conditions and circumstances in MCPS environments. In particular, technical differences between healthcare networks and those in other sectors have received little attention in research to date. Therefore, we identify the need to determine the requirements to which detection approaches in the medical field must adapt.
One main difference is the use of medical device gateways, which connect medical devices to hospital networks. Such a setup could lead to difficulties since, for example, agent-based approaches could not be easily deployed to such architectures. This also applies to innovative approaches that operate by combining local and global predicates. If, for instance, in an intelligent agent-based approach, such an agent can investigate a suspiciously-acting device, how could legacy devices be included without involving all their manufacturers? Furthermore, it is unclear to the research community if medical devices use encryption. As illustrated in section IV-C, some researchers assume that much of the traffic in a hospital is encrypted and propose a flow-based approach in response to this assumption. In contrast, other researchers like Hady et al. derive patient data from captured network traffic and assume that this traffic is sent unencrypted from medical devices to servers [61]. However, performing deep packet inspection (DPI) in the case of encrypted traffic (e.g., by implementing application layer gateways to break up encryption) is aggravated in the MCPS environment, as security engineers cannot easily place certificates on medical devices. Approaches that rely on DPI must take that into account. As a first future research topic, we see the need for a comprehensive study of the unique constraints, technical characteristics, and challenges of hospital networks, along with connected medical devices, in contrast to other IT and OT environments.

B. CREATING MCPS-SPECIFIC ATTACK DETECTION DATASETS
Researchers deal with the scarce information situation differently. Some leverage the diversity typical for the healthcare sector solely for motivational reasons. Others do not place much emphasis on where and how data is collected. Instead, they use existing, often outdated datasets that do not fit the field or their motivation and ignore any differences. Still other researchers find 'creative' ways to replicate individual features of hospital networks and test specific parts of their approach. One example is the debatable use of the MNIST digit-recognition dataset to reenact the high feature dimensionality in the health sector. Either way, detection based on real health-specific protocols is rarely conducted, leading to limited portability of detection approaches from outside the medical domain. Moreover, it leads to uncertainty about whether the supposedly most suitable approach will be the best fit in a real hospital environment. ML-based IDSs must, most certainly, be retrained to prevent an increased false positive rate in a real-world environment. Furthermore, even the more recent datasets (e.g., CIC IDS 2017/2018) are often unsuitable for detecting attacks on medical devices since they do not contain IoT or IIoT traffic. Even where datasets with such traffic exist (e.g., ToN-IoT or Bot-IoT), they might not be suitable for the health domain, as we saw that MCPS traffic significantly differs from other IoT traffic. There are several health-specific protocols (e.g., HL7 and DICOM [6], [140]), but only exceedingly few of these protocols are part of the datasets examined. Problematically, current adversaries are increasingly attacking application-layer protocols, as discussed by Hussain et al. [70]. Another factor not covered in current datasets is that different real-world attackers would behave differently. While several researchers, among other things, focus on detecting port scans or denial-of-service attacks, APT attackers would act much more stealthily. The first steps of recognizing the differences in the attacker's modus operandi were taken by Mitchell and Chen, who consider this fact with their attacker archetypes (reckless, opportunistic, and random attacker) [130]. Thamilarasu, too, takes into consideration how the attacker's behavior affects the effectiveness of the detection [116]. However, these researchers use simulations for their distinctive environment. A dataset containing such characteristics has yet to be developed.
In addition to the uncertainty about the transferability to the real world, the lack of fitting datasets limits the comparability of the approaches. When comparing ML-based approaches, it is common to compare performance indicators such as accuracy, precision, and recall. Many of the papers have calculated and reported the corresponding values in their evaluation. However, we have refrained from comparing papers based on these metrics. On the one hand, the individual algorithms were often trained and tested on different datasets, so that, at most, a small group of algorithms could be compared with each other. On the other hand, further assumptions were often made about attackers, attacks used, or the granularity of the classification (as described in section IV-D). These assumptions limit the applicability of the algorithms to single use cases and further reduce the comparability with other algorithms. Sharma et al. reacted to this and built a modular framework with a benchmarking suite. This could help future researchers to easily test their new detection algorithms and compare them directly with the work of other researchers [110]. However, since this framework is new, the community must first accept it. Furthermore, the suite also relies on the existing datasets, and while it is an important step toward comparability, it is not a comprehensive solution to this concern.
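The dataset dependence of the performance indicators named above can be illustrated with a minimal Python sketch. The confusion-matrix counts are invented purely for illustration and are not taken from any of the reviewed papers; the class and function names are likewise our own assumptions. The sketch evaluates a hypothetical detector with identical recall and false positive rate on a balanced and on a highly imbalanced test set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts of a binary attack/benign classifier on one test set."""
    tp: int  # attacks flagged as attacks
    fp: int  # benign samples flagged as attacks
    tn: int  # benign samples passed as benign
    fn: int  # attacks missed


def accuracy(cm: ConfusionMatrix) -> float:
    return (cm.tp + cm.tn) / (cm.tp + cm.fp + cm.tn + cm.fn)


def precision(cm: ConfusionMatrix) -> float:
    flagged = cm.tp + cm.fp
    return cm.tp / flagged if flagged else 0.0


def recall(cm: ConfusionMatrix) -> float:
    attacks = cm.tp + cm.fn
    return cm.tp / attacks if attacks else 0.0


def false_positive_rate(cm: ConfusionMatrix) -> float:
    benign = cm.fp + cm.tn
    return cm.fp / benign if benign else 0.0


# Illustrative counts only: the same hypothetical detector
# (recall 0.9, false positive rate 0.1) on two different test sets.
balanced = ConfusionMatrix(tp=450, fp=50, tn=450, fn=50)   # 500 attacks, 500 benign
imbalanced = ConfusionMatrix(tp=9, fp=99, tn=891, fn=1)    # 10 attacks, 990 benign

for name, cm in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(f"{name}: accuracy={accuracy(cm):.3f} "
          f"precision={precision(cm):.3f} recall={recall(cm):.3f} "
          f"fpr={false_positive_rate(cm):.3f}")
```

In this constructed example, accuracy and recall are identical on both test sets, while precision differs by roughly an order of magnitude. Reported values are therefore only meaningful relative to the dataset on which they were obtained.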
We also observed a trend to utilize distributed and federated learning approaches. It is crucial to point out that these models have specific requirements for datasets and that current datasets do not fulfill them.
In conclusion, the lack of appropriate datasets is a major obstacle to developing attack detection in the healthcare sector. From our point of view, an attack detection dataset should meet three requirements to be suitable for the health domain. It should: (a) incorporate health-specific protocols, (b) model different attacker behavior, and (c) be suitable for specific scenarios and techniques, such as distributed and federated learning. The generation of such MCPS-specific datasets has to be addressed in the future. Therefore, we identify it as the second future research topic.

C. ADVANCING STANDARDIZATION PROJECTS
In addition to capturing the current state and the characteristics of the healthcare sector and mapping that state into datasets, we see the need to get to the root of the increasing diversity and the proprietary technology that make IT security in healthcare so difficult. Therefore, efforts to standardize the sector's digital infrastructure must be intensified. Initiatives such as OpenICE are commendable. However, OpenICE was developed without consideration of security [145]. Additionally, intrusion detection in OpenICE-based networks will not be easily transferable to real hospital networks, as the network protocols are unique to the OpenICE environment.
The urgent need for standardization also applies to the device level. While it seems unsurprising to find the majority of researchers choosing the network as the data collection location (corresponding to the insights into the heterogeneity of medical devices discussed in section I), many researchers are examining single medical devices and designing specially adapted host-based detection approaches. This suggests that despite the hurdles in this area, many researchers would like to take advantage of the insights that device data can provide. One example of the many possibilities is the feedback reaction to a detected anomaly implemented by Ardito et al. [34], discussed above. However, the heterogeneity of devices currently means that a separate solution is required for each type of device. That is why most researchers propose an HIDS focused on a single device type or a few device types (e.g., a smart artificial pancreas [132], a smart connected pacemaker [100], or a diabetes management control system [44]). The transferability of these approaches to the real world and, in particular, their scalability are problems that researchers often do not address. Therefore, manufacturers should also incorporate security considerations into the development of devices. Device and log data have to be made accessible to security experts in a standardized way. We therefore identify advancing standardization efforts as the third future research topic. As explained, this applies to both the device level and the network level.

D. CONNECTING TECHNICAL AND MEDICAL DATA FOR ATTACK DETECTION
In addition to the technical aspects of healthcare networks, some researchers explored the potential of medical context in various forms for intrusion detection. The promised benefits were manifold. For example, in the case of a syringe pump, medical context could reveal that a dose is too high for a patient and thereby help recognize not only technically novel attacks and malicious insiders but also simple mistakes by health personnel. However, any initial attacker efforts or intrusion attempts might go unnoticed in these approaches. An attacker is only discovered once a device is already compromised and she tries to manipulate the care process. We have presented initial approaches for combined detection based on network traffic and medical data. However, in these approaches, detection has taken place independently and based on unrelated data sources. The genuine and thorough integration of the medical context with detection based on technical features, and thus the creation of a holistic approach, is the fourth future research topic.
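The idea of correlating both data sources can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions and does not reproduce any reviewed approach: the event fields, the anomaly score of a hypothetical network IDS, and the threshold of 0.8 are all invented, and the dose limit is a placeholder without clinical meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PumpEvent:
    """One observation of a syringe pump (all fields are hypothetical)."""
    commanded_rate_ml_h: float    # infusion rate requested from the pump
    prescribed_max_ml_h: float    # upper limit from the patient's prescription
    network_anomaly_score: float  # 0..1, output of an assumed network IDS


def classify(event: PumpEvent, network_threshold: float = 0.8) -> str:
    """Combine medical context and network evidence into one verdict."""
    dose_violation = event.commanded_rate_ml_h > event.prescribed_max_ml_h
    network_suspicious = event.network_anomaly_score >= network_threshold

    if dose_violation and network_suspicious:
        # Both sources agree: manipulation of the care process is likely.
        return "likely attack"
    if dose_violation:
        # Medical context alone: operator mistake or malicious insider.
        return "unsafe dose"
    if network_suspicious:
        # Technical features alone: early intrusion attempt, care unaffected.
        return "suspicious traffic"
    return "normal"


print(classify(PumpEvent(12.0, 10.0, 0.95)))  # dose too high and odd traffic
print(classify(PumpEvent(12.0, 10.0, 0.10)))  # dose too high, traffic normal
print(classify(PumpEvent(5.0, 10.0, 0.95)))   # dose fine, odd traffic
```

The third case shows what medical context alone would miss: the intrusion attempt is visible only in the technical features, before the care process is manipulated.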

E. DRIVING RISK-AWARE ATTACK PREVENTION
During this survey, we observed that automatic intrusion prevention in the medical sector is handled even more carefully than in other sectors. The reason lies in the high stakes: lives depend on the system's proper functioning. While in other domains a quick shutdown of a system that is most likely compromised may be just the right response in the risk assessment, an MCPS might still provide life-sustaining measures despite a compromise. Thus, the reaction must be weighed quite differently in this domain. As presented in section IV-F, very few research groups address prevention and mitigation at all. The vast majority of them use the isolation capabilities that their SDN approach provides. While this does not necessarily mitigate the attack, it can stop potential lateral movement. Especially in the context of malware (particularly ransomware), this can be very valuable. However, the implications for the continued functioning of isolated medical devices are rarely considered. Other endeavors propose individual solutions for single medical devices. While these preventive approaches are often innovative, they are tailored to the specific device type, and their risk assessment is not (easily) transferable to other devices. An illustrative example is the IDS for an insulin pump, which notifies its user in case of an anomaly. The user can override the preventive measures and thus correct a false detection if necessary. This procedure would be fatal in the case of a pacemaker, for example, because here it is important to react very promptly to anomalies. If the user is consulted first, it is questionable whether she can respond in a timely manner (or at all).
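The trade-off described above can be made concrete with a deliberately simplified Python sketch. It is an illustration under our own assumptions, not a policy proposed in the reviewed literature and not clinical guidance: the two decision criteria and the three response options are hypothetical and only mirror the examples in the text.

```python
from enum import Enum


class Response(Enum):
    """Hypothetical response options after a detected anomaly."""
    ISOLATE = "isolate device from the network"
    NOTIFY_USER = "notify user and allow override"
    KEEP_RUNNING_ESCALATE = "keep therapy running and escalate to clinical staff"


def choose_response(life_sustaining: bool,
                    user_can_respond_in_time: bool) -> Response:
    """Pick a response based on the risk of interrupting the device."""
    if life_sustaining:
        # A shutdown or blind isolation could itself endanger the patient.
        return Response.KEEP_RUNNING_ESCALATE
    if user_can_respond_in_time:
        # The user can correct a false detection, as in the insulin pump example.
        return Response.NOTIFY_USER
    # No immediate risk from interruption: stop potential lateral movement.
    return Response.ISOLATE


print(choose_response(life_sustaining=False, user_can_respond_in_time=True).value)
print(choose_response(life_sustaining=True, user_can_respond_in_time=False).value)
print(choose_response(life_sustaining=False, user_can_respond_in_time=False).value)
```

Even this toy policy shows that the appropriate reaction depends on the device type and not only on the detection result, which is why a risk assessment made for one device cannot simply be reused for another.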
Since attackers and malware make no distinction between hospitals and other targets, these considerations and difficulties must not lead to a neglect of prevention. Just as in other fields, a quick but well-thought-out response in the event of an attack is essential in the medical field. Hence, a fundamental discussion about intrusion prevention in the medical domain, the sector-specific requirements, and how it can succeed despite the high risks has to be conducted. We conclude that this is the fifth future research topic.

VI. CONCLUSION
As early as 2012, Clark and Fu pointed out two challenges in the context of the security of medical devices: ''(1) computer security researchers seldom have access to real medical devices for experimentation, and (2) the computer security community is largely disjoint from the biomedical engineering community.'' [156] These challenges persist to this day.
In this paper, we conducted a structured literature review by following the guidelines of Kitchenham et al. We found that there are many synonyms for MCPS and that many security terms are used with a different meaning in the medical domain. Most researchers focused on an anomaly-based detection approach at the network layer. The detection of malicious insiders was the primary focus. Several researchers used publicly available datasets for training and testing their algorithms. Others criticized the lack of suitable datasets and developed testbeds consisting of various medical devices. While some medical devices were used by multiple research groups, we observed no clear preference. Based on the results, we identified five research gaps. We discussed why it is necessary to examine the special conditions of hospital networks, the MCPS deployed within them, and the contrasts to other IT and OT environments. Furthermore, we see an urgent need for the creation of MCPS-specific datasets. Only with such datasets can researchers address the requirements and the unique possibilities of the healthcare domain. Alongside this, we see the need to support and expand MCPS standardization projects. Moreover, the medical domain offers an excellent opportunity to fortify attack detection based on technical features with medical context, thereby creating a holistic approach. Last but not least, a fundamental discussion should be held about the challenges of intrusion prevention in the medical domain and how it can succeed despite the high risks. We are confident that by countering these challenges, IT security in hospitals can be enhanced, and patients' lives can be protected.

STEFAN STEIN received the B.S. degree in project engineering from Baden-Württemberg Cooperative State University, in 2017, and the M.S. degree in IT security and forensics from the Wismar University of Applied Sciences, in 2019. Currently, he conducts research as a Doctoral Student with the Brandenburg University of Applied Sciences. In addition, he is a Security Analyst with Gematik GmbH. His work focuses on IT security in healthcare, including vulnerability scanning, security monitoring, data protection, security incident assessment, and vulnerability and information security management.
MICHAEL PILGERMANN received the Ph.D. degree in information security from the University of South Wales, in 2006. Afterward, he gained 15 years of professional IT security experience in business and administration. Since 2021, he has been a Professor with the Brandenburg University of Applied Sciences, specializing in IT security. His research focuses on the detection of cyber attacks in the operation of critical infrastructures.
THOMAS SCHRADER is a Professor of applied computer science, focusing on medical informatics with the Brandenburg University of Applied Sciences. His research interests include data quality in medical data repositories, prospective risk analysis in medical environments, motion analysis, evaluation of hyperspectral images, and e-health.