Smart Substation Communications and Cybersecurity: A Comprehensive Survey

Electrical grids generate, transport, distribute and deliver electrical power to consumers through a complex Critical Infrastructure which has progressively shifted from an air-gapped to a connected architecture. Specifically, Smart Substations are important parts of Smart Grids, providing switching, transforming, monitoring, metering and protection functions to offer a safe, efficient and reliable distribution of electrical power to consumers. The evolution of electrical power grids was closely followed by the digitization of all its parts and improvements in communication and computing infrastructures, leading to an evolution towards digital smart substations with improved connectivity. However, connected smart substations are exposed to cyber threats which can result in blackouts and faults which may propagate in a chain reaction and damage electrical appliances connected across the electrical grid. This work organizes and offers a comprehensive review of architectural, communications and cybersecurity standards for smart substations, complemented by a threat landscape analysis and the presentation of a Defense-in-Depth strategy blueprint. Furthermore, this work examines several defense mechanisms documented in the literature, existing datasets, testbeds and evaluation methodologies, identifying the most relevant open issues which may guide and inspire future research work.

José Gaspar is with the University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, 3030-290 Coimbra, Portugal, and also with the Faculty of Applied Sciences, Macao Polytechnic University, Macau, SAR, China (e-mail: jahgaspar@dei.uc.pt).
Tiago Cruz and Paulo Simões are with the University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, 3030-290 Coimbra, Portugal (e-mail: tjcruz@dei.uc.pt; psimoes@dei.uc.pt).
Chan-Tong Lam is with the MPU-UC Joint Research Laboratory in Advanced Technologies for Smart Cities, Faculty of Applied Sciences, Macao Polytechnic University, Macau, SAR, China (e-mail: ctlam@mpu.edu.mo).
Digital Object Identifier 10.1109/COMST.2023.3305468

critical infrastructure sectors identified by the Department of Homeland Security (DHS) of the United States of America [1], such as communications, transportation, or water distribution systems, among others. Consequently, it should come as no surprise that attacks on power grids have outnumbered the combined number of cyberattacks recorded on all other critical sectors, accounting for 53% of cyberattacks [2] as of 2018.
Modern power grids constitute complex geographically distributed systems where substations play a vital role, providing switching, transforming, monitoring, metering, and protection functions for a safe, efficient and reliable delivery of electrical power to consumers, controlling and directing the flow of electrical power and ensuring system safety through a series of control and protection mechanisms. Substations constitute a vital part of the interdependency chain for many critical infrastructures [3], [4]; thus, a successful cyberattack against a substation may trigger a chain reaction with cascading effects throughout the electrical power grid, disrupting critical infrastructure services, with damaging effects on society, security, economy, public health, or safety.
This situation has been somewhat aggravated by a paradigm shift towards the development of Smart Grids (SG), as part of the ongoing digital transformation process in the electrical sector. SGs improve the efficiency, reliability, and resiliency of electrical grids, allowing them to cope with increasing demands and with the challenges introduced by a shift from a unidirectional centralized architecture to a bidirectional decentralized electrical power generation model.
Substations are not immune to this trend, having evolved to become cyber-physical SG components. Specifically, Smart Substations (SS), also known as Digital Substations, are the result of a modernisation effort aiming at reducing design complexity, improving equipment interoperability, increasing electromagnetic interference immunity and improving reliability and control, while simplifying commissioning procedures and reducing operational costs.
Yet, this evolution comes at a cost. Historically, electrical power grids and substations have long been operated by air-gapped Industrial Automation Control Systems (IACS), typically based on Supervisory Control and Data Acquisition (SCADA) systems, limiting the exposure to cybercriminals and cyberattacks. As a consequence of the SG digital transformation process, there has been an ongoing transition from air-gapped to connected architectures, increasing the exposure of the electrical sector to cybersecurity threats. SSs, as core components of the power grid, are not exempt from this, having naturally become potential targets of cybercriminals aiming to cause severe service disruptions.
The reasons for this situation are manifold. Most communication protocols developed in the past were not designed with security measures in mind, as they were very specific and applied in niche domains, leading to an assumption that a "security by obscurity" approach would ensure protocol safety. However, the establishment of SS reference designs based on standards such as IEC 61850 [5] made SS communication protocols and information models public. In addition, the information exchanged within networks became financially meaningful with the liberalization of the electrical market and the introduction of variable demand-dependent tariffs. More than ever, attacks against SSs can have potentially devastating effects on critical sectors, with severe damaging effects on safety and economies, thus becoming an object of concern for governments, operators and standardisation bodies.
This paper reviews and organizes relevant works published in the past years, identifying the most relevant aspects of smart Substation Automation Systems (SAS) communications and cybersecurity, and presenting a one-stop, comprehensive and structured overview of the most relevant achievements and knowledge produced in SS security, as part of a wider SG infrastructure. Moreover, it identifies the cybersecurity requirements and reviews relevant system/network architectures, standards, vulnerabilities, and intrusion detection and prevention systems, complemented by the presentation of a Defense-in-Depth strategy for SS. Its main contributions can be summarized as follows:
• Provide an overview of the SS cybersecurity landscape.
• Review the most relevant SS cybersecurity standards.
• Review existing datasets and testbeds.
• Review recent publications and research efforts on intrusion detection for SS.
• Propose and present a Defense-in-Depth strategy for SS.
• Identify open issues and future research directions.
Overall, this paper intends to provide a comprehensive communications- and cybersecurity-centric perspective on substations, with a particular focus on smart substations. It aims at providing a one-stop reference for anyone starting to delve into the topic, providing a perspective on SSs which is focused both on their internal architecture and on their role as part of the grid infrastructure. The rest of this paper is organized as shown in Table I.

II. SURVEY SCOPE, CRITERIA AND RELATED WORK
Recent surveys focus on specific topics of the SS and SG cybersecurity domains. This work, instead, presents a comprehensive and encompassing overview of the SS cybersecurity domain, focused on the following key aspects: (i) technologies, protocols and concepts, (ii) cybersecurity objectives and requirements, (iii) attack surface characterization, (iv) detection techniques, (v) datasets and testbeds, (vi) SS and SS cybersecurity standards, and (vii) proposition of a defense-in-depth strategy.
This survey covers a set of publications from 2016 to 2023, selected from the IEEE Xplore, ACM, Scopus, and Web of Science (WoS) databases, based on search results comprising the keywords: "SCADA", "intrusion detection", "network intrusion detection", "NIDS", "Supervisory Control and Data Acquisition", "IEC 61850", "IEC 62351". The results were sorted by relevance, according to each search engine's specific criteria, and further filtered based on the relevance of each publication in the context of this survey.

A. Related Work
Despite the critical roles of SSs within SGs, the potentially extensive damages caused by SS cyberattacks, and the cascading effects on SGs and other critical infrastructures, there is a scarcity of survey or review publications uniquely dedicated and focused on SS cybersecurity. Instead, most publications survey or review SGs [6], [7], [8], [9], [10], [11] and SCADA systems [12], [13], [14] focusing on different topics, but only occasionally covering aspects related to SSs.
Table II lists a few high-ranked surveys related to SSs, and summarizes the key topics surveyed.
Liberati et al. [18] present a systematic and quantitative review on the basic working principles of cyber-physical attacks against SGs, identifying existing vulnerabilities and dynamical properties exploited by attacks.
Quincozes et al. [19] present an in-depth survey on Intrusion Detection and Prevention Systems (IDPS) for SSs, suggesting a taxonomy of SS design and deployment aspects, including IDS architectures, detection approaches, analysis methods, types of actions, data sources, detection ranges, evaluation methods, and metrics. Further, the authors compile twenty-four detection rules deployed by state-of-the-art IDSs and assess their detection effectiveness on five types of cyberattacks: replay, naïve injection, IEC-61850 injection, masquerade, and Denial of Service (DoS).
On the other hand, Mathebula and Saha [20] focus the review and discussion on smart substations' cybersecurity challenges, including existing risks and threats from cyberattacks such as replay, DoS, and Man-in-The-Middle (MiTM), as well as cybersecurity challenges originating from network packets, computer viruses, network storms, and internal threats. The survey reviews the domain-specific cybersecurity requirements of SSs and cybersecurity protection measures based on digitally-signed Generic Object-oriented Substation Events (GOOSE) messages and TLS/SSL-encrypted Manufacturing Message Specification (MMS) messages. A similar approach is followed by Ghiasi et al. [16], albeit focused on a wider SG scope covering different cyberattack models as well as suitable solutions and approaches to deal with them, providing an overview of future research directions.
A different approach is adopted by Hussain et al. [25], who review and discuss the reliability and availability of substation communication networks and review adequate dependability evaluation methods. The authors present and contrast different IEC 61850 substation communication network architectures, as well as seamless redundant zero-delay failure recovery protocols such as Parallel Redundancy Protocol (PRP) and Highly available Seamless Redundancy (HSR). Aftab et al. [21] provide a survey on substation equipment and IEC 61850 communications modeling efforts for performance assessment with different technologies, also envisioning cybersecurity applications.
A protocol-focused review is presented by Cai et al. [26], with an overview of IEC 62351 features to secure different IEC 61850 messages, including GOOSE, Sampled Measured Values (SMV), routable-GOOSE, routable-SV, and MMS messages. The authors present an overall IEC 61850 overview and highlight intrinsic vulnerabilities and security requirements, challenges, and potential cyberattacks. The IEC 62351 standard is further presented along with security considerations to cope with IEC 61850 vulnerabilities. Similarly, Lázaro et al. [17] review SG and SS-level communication vulnerabilities, discussing the application of the security measures advised by IEC 62351-6 considering the adoption of Time-Sensitive Networking (TSN) in the SG context. More recently, Silveira et al. [15] have also focused on IEC 61850 GOOSE security, providing an overview of attacks, mitigation methods, and existing gaps.
An alternative take on the subject is presented by Tan et al. [27], who focus on recent security vulnerabilities and data-driven security approaches within the entire SG data generation, acquisition, storage, and processing lifecycle.The comprehensive survey presents key aspects crucial to grid management such as State Estimation (SE) and Wide Area Management System (WAMS), which rely on phasor measurements, sampled by SS Phasor Measurement Units (PMU), and merged by Phasor Data Concentrators (PDC).
A related perspective is provided by Sundararajan et al. [23], who also acknowledge the importance of synchrophasor data, sampled at SSs, for real-time wide-area electrical grid monitoring and protection systems. The authors introduce a set of synchrophasor data quality attributes, their challenges, and evaluation strategies. Moreover, they identify the interdependencies between data quality and cybersecurity challenges at different levels and highlight how quality helps to identify cybersecurity issues. Also on a similar scope, Kumar et al. [24] offer an encompassing perspective on Smart Metering Infrastructures for SGs, including a threat analysis, elicitation of specific security requirements, existing approaches and open issues.
Zhang et al. [22] identify SS security requirements and shortcomings of IEC 62351, proposing a Certificateless Public Key Cryptography (CLPKC) security scheme which avoids the certificate exchange latency of traditional certificate-based cryptosystems and the key escrow problem of identity-based cryptosystems. The proposed method complies with the strict message timing requirements of SSs and prevents repudiation and replay attacks.
While the analysis of the aforementioned surveys provided a solid insight into the current landscape in terms of literature reviews, it also helped identify a notable gap: the absence of a source providing an encompassing perspective on the architectural, communications, cybersecurity and standardisation aspects of the SS domain and its surrounding context. This constitutes the main motivation for this paper: to cover the foundational knowledge necessary to develop a solid perspective on the communications and cybersecurity aspects of the SS domain, also providing references to assist the process of delving into more detail regarding specific aspects.

III. SMART GRIDS
Electrical power grids are complex critical infrastructures that transport electricity from generation locations to consumers. Traditional electrical power grids adopt a top-down unidirectional power flow architecture which relies on bulk electricity generation facilities and a hierarchical structure of transport and distribution networks. However, this model has started to show its age, becoming increasingly unable to cope with the emerging energy consumption, production and even storage trends. For instance, demand in terms of electric energy consumption over the past 44 years has been on a steady rise, with worldwide electrical energy consumption increasing from 5268 TWh, in 1974, to 22315 TWh, in 2018 [28]. This trend, which represents an average annual increase of 3.41%, along with a shift towards a distributed generation model, is one of the factors fuelling smart grid development.
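The growth figure quoted above can be checked with a short compound annual growth rate computation. The sketch below is only illustrative: the helper name `cagr` is hypothetical, and reading the quoted percentage as a compound rate is an assumption, since the text does not state how the average was obtained. Under that assumption, the result is roughly 3.4% when 43 compounding periods are used and roughly 3.3% when 44 are used.

```python
def cagr(start: float, end: float, periods: int) -> float:
    """Compound annual growth rate between two values over `periods` years."""
    return (end / start) ** (1.0 / periods) - 1.0


# Worldwide consumption figures quoted in the text (TWh).
consumption_1974 = 5268.0
consumption_2018 = 22315.0

# The number of compounding periods is an assumption; both readings are shown.
for periods in (43, 44):
    rate = cagr(consumption_1974, consumption_2018, periods)
    print(f"{periods} periods: {rate:.2%} per year")
```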
The National Institute of Standards and Technology (NIST) defined a Smart Grid Architectural Model (SGAM) [29], presented in Figure 1, consisting of seven domains composed of interconnected systems and devices to deliver electrical power to consumers.
Smart grids encompass both electrical and communication networks, with the first being responsible for the transmission and distribution of electricity to end consumers, and the latter being responsible for monitoring and controlling grid processes. Smart grids extend data communications to edge components such as (i) smart meters which collect real-time consumption data, and (ii) smart appliances which rely on grid data to schedule operations and balance electrical power production and consumption, minimizing losses and reducing storage requirements.
Table III (adapted from [30], [31], [32]) compares traditional and smart grids from a multidomain perspective. Smart grids improve the reliability and efficiency of electrical grids, managing and balancing electrical power generation and consumption by adopting a flexible network topology that enables bidirectional energy flow and distributed electricity generation, making them more sustainable, resilient, and efficient. By accommodating bidirectional data and electrical power flows, smart grids highlight the importance of prosumers, customers who both consume and produce sustainable energy, sharing their surplus with other grid users and thus playing a vital role in peak energy demand management through energy management systems.

IV. SMART SUBSTATIONS
Substations are important parts of an SG infrastructure and perform different functions, according to their location, configuration, operating voltages, and application. They are responsible for stepping up voltages to reduce transmission losses, stepping down high voltages to lower voltages suitable for distribution, protecting SG assets, and controlling electrical power flow to cope with supply and demand.
Substations adopt a variety of configurations using air, gas, or hybrid insulators. In addition, substations may operate on AC or DC voltages, at different levels such as Low Voltage (LV), Medium Voltage (MV), High Voltage (HV), and Extra High Voltage (EHV). In comparison with conventional grids designed for one-way electricity flow, SSs are designed with the SG bidirectional electricity and data flows in mind, also providing improved protection and monitoring capabilities.

A. Main Functions
As discussed next, smart substations transform, monitor, protect, meter, and control electrical power flow to offer a safe, efficient and reliable electricity supply to consumers. Their role encompasses several functions, namely:
• Transformation: Grids operate at different voltage levels to improve electrical power transport efficiency. Voltages are typically raised (step-up) to higher levels when transported over long distances, from generation to consumption locations. However, high voltages used in electrical power transport are unsafe and inappropriate for consumption. Hence, substations lower the incoming electrical voltage (step-down) in distribution facilities located near consumer endpoints.
• Monitoring and Metering: SSs constantly monitor equipment state to ensure a resilient and efficient operation, providing uninterrupted service. Smart substation monitoring activities often consist of measuring and metering electrical currents and voltages at various substation sections to assess and maintain normal operation. Metering functions often include other measurements, such as temperatures, to assess the operating conditions of critical components.
• Control: SSs are equipped with control capabilities that enable local and remote command of the various components and equipment. Substation remote control functions offer a quick and convenient method to manage and control numerous SSs from a single remote location, allowing electrical power to be interrupted, resumed, or switched to different locations according to SG operational needs.
• Protection: SS protection functions reliably and efficiently identify electrical system and equipment faults, triggering the necessary isolation and recovery actions to prevent damage or risks to human safety.

B. Types and Roles
Smart substations play an essential role in SGs and are typically classified into four major types [33], according to their roles and applications: Switchyard substations are part of generating stations and connect generators to the grid; Customer substations serve particular business customers based on their specific requirements; System substations transfer bulk electrical power across the grid, with some only providing switching functions, whereas others also convert the incoming voltage; and, finally, Distribution substations are located close to consumption areas and distribute electrical power to most customers.

C. Architecture
Recent smart substations, based on the IEC 61850 standard, combine traditional power flow and modern data flow technologies in a cyber-physical system that adopts an architecture composed of station, bay, and process levels, connected by process and station communication buses, as shown in Figure 2. The process bus transfers data between devices connected to process and bay levels, whereas the station bus connects station and bay levels, enabling communications among devices and systems.
This multi-level topology, along with firewalls, unidirectional gateways (such as data diodes), and other network components, prevents cybercriminals from reaching critical bay and process-level cyber-physical devices. Moreover, the architecture copes with network-specific communication requirements and enables traffic segmentation, reducing congestion and preventing critical collision events in process-level communications.

D. Functional Levels
The SS architecture is divided into station, bay, and process functional logic levels, as displayed in Figure 3, with specific roles, responsibilities, and logical interfaces listed in Table IV.
Next, we address in more detail each of these levels.
Station Level: Supervises, monitors, commands and controls all substation equipment and processes, either locally or from a remote location. The station level includes a Remote Terminal Unit (RTU) for remote control and a Programmable Logic Controller (PLC) for local control, as well as an engineering station and a Human-Machine Interface (HMI), for configuration and monitoring, respectively. Data exchanged between station components is carried out via a logical interface listed in Table IV.
Bay Level: Consists of protection and control devices (e.g., IEDs and PMUs) that directly monitor, protect, control, and record electrical processes. Communications within bay level functions are carried out via logical interface 3, whereas communications between bay and process levels are carried out via logical interfaces 4 and 5. Moreover, protection and control data are exchanged between bay and station levels via logical interfaces 1 and 6, respectively. Finally, different bays directly exchange data via logical interface 8, and remote protection data is exchanged with the bay level via logical interface 2.
Process Level: Consists of protection, switching, power transforming, metering, and electricity flow handling equipment. The equipment includes circuit breakers, switchgears, power transformers, as well as current and voltage instrument transformers. The process level handles data acquisition from electrical processes and controls switches and other electrical devices (MUs, CBs, and IEDs) to protect the substation equipment from overloads or faults. Communications between process and bay level functions are carried out via logical interfaces 4 and 5.
Together, these functional levels provide a hierarchical model that layers process, monitoring, protection and control functions in a coherent way, with logically defined interfaces between layers and functional blocks.
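The interface assignments described above can be summarized as a small lookup structure. The following is a minimal sketch and not part of IEC 61850: the names `Level`, `LOGICAL_INTERFACES` and `interfaces_between` are hypothetical, and only the interface numbers explicitly given in the preceding paragraphs are included (the interface used within the station level is left out because its number is not stated there).

```python
from enum import Enum


class Level(Enum):
    STATION = "station"
    BAY = "bay"
    PROCESS = "process"
    REMOTE = "remote protection"  # remote end, outside the local substation


# Logical interface number -> (endpoint A, endpoint B, role described in the text).
LOGICAL_INTERFACES = {
    1: (Level.BAY, Level.STATION, "protection data"),
    2: (Level.BAY, Level.REMOTE, "remote protection data"),
    3: (Level.BAY, Level.BAY, "communications within bay level functions"),
    4: (Level.BAY, Level.PROCESS, "bay-process communications"),
    5: (Level.BAY, Level.PROCESS, "bay-process communications"),
    6: (Level.BAY, Level.STATION, "control data"),
    8: (Level.BAY, Level.BAY, "direct data exchange between bays"),
}


def interfaces_between(a: Level, b: Level) -> list[int]:
    """Return the logical interface numbers that connect levels a and b."""
    return sorted(
        number
        for number, (x, y, _) in LOGICAL_INTERFACES.items()
        if {x, y} == {a, b}
    )


print(interfaces_between(Level.BAY, Level.PROCESS))
print(interfaces_between(Level.BAY, Level.STATION))
```

A structure of this kind makes it easy to see that, in the description above, every listed interface has the bay level as one endpoint, and that the station and process levels never exchange data directly.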

E. Physical System Components
Smart substations combine cyber and physical systems, as illustrated in Figure 4. Physical systems consist of electrical parts, equipment, and devices responsible for transporting, switching, transforming, isolating, and protecting the electrical power system. The physical systems include buses (BUSn) that carry electrical power along the substation to various components, including: (i) voltage arresters, which discharge voltage spikes and prevent damage to devices connected to the network, (ii) circuit breakers (BRKn), which interrupt sections of the electrical grid, (iii) power transformers (TR1), which step-up or step-down voltages, (iv) current transformers (CTn), which sample bus current flows, (v) voltage transformers, which sample bus voltages, (vi) capacitor banks, which manage power-factor corrections, (vii) switches (SWn), which interrupt electric network sections, and (viii) reclosers, which interrupt the current flow during anomalous events, reclosing the circuits multiple times to assess and recover from temporary faults.

F. Communication Buses
The SS architecture, displayed in Figure 2, relies on two communication buses to exchange data between components of the same or adjacent functional levels, according to specific communication requirements.
First, the Station Bus provides a local high-speed fiber-optic network which connects bay level IEDs and station level devices, allowing peer-to-peer messaging for protection and layered mapping (TCP/IP) for data acquisition and control.
The Process Bus is a high-speed fiber-optic digital communication network which connects process and bay level equipment, enabling high-speed current and voltage Sampled Measured Values (SMV) multicast messages over Ethernet, as well as the exchange of monitoring and control signals. The process bus reduces interconnection wires, improves safety, and simplifies commissioning operations. The bus complies with the strict transfer time requirements of time-critical services such as protection control trip commands, as well as Current Transformer (CT) and Voltage Transformer (VT) SMV. The introduction of a process bus greatly simplified the connections between bay-level protection and control and process-level devices, replacing legacy parallel copper wire connections with a single fiber serial communication bus.
The communication protocols, defined in IEC 61850, are detailed in Section X of this manuscript.

G. Cyber System Components
This subsection covers the cyber systems (devices and networks) that provide reliable metering, monitoring, protection, and control functions within SSs.
Merging Units (MU): are industrial electronic devices that merge voltage and current measurements, sampled simultaneously from instrument current and voltage transformers, into standard IEC 61850 sample value messages sent to protection devices.
Intelligent Electronic Devices (IED): are industrial microprocessor-based controllers which automate SSs by processing input sensor and power equipment data and outputting proper control commands to SS power components. IEDs may perform a wide range of protection, monitoring, and control functions including protective relaying, circuit breaker control, capacitor bank switch control, recloser control, tap change control, and voltage regulation. Recent IEC 61850-compliant IEDs offer interoperability, advanced communication capabilities, improved reliability, and reduced deployment complexity.
Remote Terminal Units (RTU): are intelligent microprocessor-controlled industrial devices equipped with communication capabilities and input/output interfaces that connect to the physical world through sensors and actuators, enabling SS remote control, monitoring, and telemetry functions.
Programmable Logic Controllers (PLC): are modular microprocessor-based electronic devices equipped with inputs and outputs, which execute programmable logic. PLCs were originally designed to replace complex wired logic performed by switches, relays, and other electromechanical components. However, PLCs have evolved and acquired complex control and communication capabilities, becoming part of complex Industrial Control Systems (ICS).
Phasor Measurement Units (PMU): are electronic devices that measure current and voltage phasors, frequency, and the Rate of Change of Frequency (ROCOF) of electrical power lines, overcoming the limited sampling rate of existing SCADA systems and their restriction to voltage and current amplitude measurements. Phase measurements carry substantial substation and grid operation state information. They are often aggregated with amplitude and phase measurements collected from other substations to enable Wide-Area Monitoring and Protection System (WAMPS) or Wide Area Monitoring, Protection and Control (WAMPAC) functions. PMUs rely on accurate timing information and synchronization across the electrical grid.
Figure 5 illustrates current and voltage instrumentation transformers which collect analog samples from electric power lines, and a PMU which converts the measurements into timestamped synchronized phasors (synchrophasors) based on a Global Positioning System (GPS) atomic clock source.
Figure 6 depicts a block diagram of a PMU. Current and voltage transformers provide power line analog measurement samples which are converted to a digital format according to the IEEE C37.118 [35], [36], [37], [38] standard, at a sampling frequency synchronized by an atomic clock reference. The synchronization of voltage and current measurements is crucial for wide-area measurement processing systems.
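To make the notion of a phasor measurement more concrete, the sketch below estimates the fundamental phasor of a sampled waveform with a one-cycle discrete Fourier transform, a common textbook technique. It is an illustrative assumption rather than the algorithm of any particular PMU or a procedure mandated by IEEE C37.118; the function name `estimate_phasor`, the 32 samples per cycle and the synthetic test signal are all hypothetical.

```python
import cmath
import math


def estimate_phasor(samples, samples_per_cycle):
    """One-cycle DFT estimate of the fundamental phasor.

    Returns a complex number whose magnitude is the RMS amplitude and whose
    angle is the phase relative to a cosine reference at the window start.
    """
    n = samples_per_cycle
    window = samples[:n]
    acc = sum(
        x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(window)
    )
    return (math.sqrt(2) / n) * acc


# Synthetic waveform: 100 V peak, 30 degree phase, 32 samples per cycle.
n = 32
peak = 100.0
phase = math.radians(30.0)
samples = [peak * math.cos(2 * math.pi * k / n + phase) for k in range(n)]

phasor = estimate_phasor(samples, n)
print(f"RMS magnitude: {abs(phasor):.2f} V")
print(f"Phase angle:   {math.degrees(cmath.phase(phasor)):.2f} deg")
```

In a real device, the window start would be aligned to the common time reference, so that phase angles estimated at different substations can be compared directly; this alignment is what turns a phasor into a synchrophasor.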

H. SS Function Virtualization: The Road Ahead?
The SS concept should not be considered a static construct, but rather a blueprint geared towards evolution. One such example is the possibility of a service and equipment consolidation trend in SS and SG systems, much in line with what happened in the IT world, with the introduction of virtualized protection and control components [39] and/or networks, to improve flexibility, resiliency and protection. This direction had already been hinted at in previous work focused on virtualizing automation equipment using commodity hardware [40], but gained traction in recent years with an emerging body of SS and SG related research.
For instance, [41] introduced the Grid Function Virtualization concept, which is focused on the deployment and management of SG functions in virtualized environments. Also, [42] researched the possibility of virtualizing power grid elements using a single or redundant centralised control and protection unit architecture, with [43] also presenting several design considerations regarding the virtualization of substation automation functions. However, most of these efforts lack a detailed exploration of virtualization technologies and in-context evaluation efforts focused on IEC 61850 requirement compliance.
Other proposals include [44], which introduces the Virtual Protection Relay concept, also discussing aspects related to protection function distribution, redundancy and synchronization; however, the proposed implementation and evaluation do not address the associated virtualization aspects, being based on proprietary frameworks. The use of Software-Defined Networking (SDN) and containers is proposed in [45], as part of a co-simulation framework of IEC 61850 communications between virtual IEDs (vIEDs). Reference [46] analyzes the use of SDN in IEC 61850-based substation automation. The Centralized Protection and Control concept, presented in [47], [48], [49], constitutes one of the most realistic efforts towards SS virtualization, covering the issues surrounding the consolidation of mixed-criticality workloads (hard-RT and general-purpose), the ability to implement vIEDs in commodity x86 hardware, support for network determinism and implementation of orchestration (this latter aspect being frequently overlooked in other studies).
Nevertheless, and despite the significant changes brought by a possible functional SS consolidation trend, existing evidence shows that it will most likely happen within the scope of the already established communications standards, requirements and interoperability characteristics.

I. Smart Substation Standards
Smart substation design and operation are governed by a set of standards that define (i) precise clock synchronization functions for data exchange among PMUs and PDCs, (ii) resilient high-availability automation networks, and (iii) communication networks and systems for power utility automation. A concise taxonomy of the SS standards, presented in Figure 7, reveals these three main functions. They will be discussed in detail in Sections V and VI, providing a multi-domain overview of the substation communications standards landscape, focused respectively on internal operation and on wide-area telecontrol/monitoring operation.

V. SMART SUBSTATION COMMUNICATIONS
A modern Smart Substation constitutes an evolved cyber-physical system supported by digital communications. This section starts by introducing the communications requirements, followed by a discussion of the most relevant standards for network redundancy and reliability, communications protocols and time synchronisation.

A. Message Performance Requirements
Smart substations host several devices working cooperatively and exchanging messages to provide an efficient, safe and reliable operation. Such messages have distinct transfer time requirements based on their criticality to SS operation. Table V lists the SS message types and performance requirements defined in IEC 61850-5 [34].
The standard defines seven substation message types with maximum transfer times according to the application type. Trip commands are of critical importance and closely related to protection systems, thus imposing a strict maximum transfer time of just 3 ms (and zero-loss tolerance). Commands such as close, reclose, start, stop, block, unblock, trigger, release, and state change have a relaxed transfer time requirement of 20 ms. Medium-speed messages are less critical and tolerate longer transfer times, despite relying on accurate time tagging strategies. Low-speed time-tagged messages are used for slow control functions, events, set-point management, system data, and non-electrical measurements such as temperature.
Process bus raw data messages consist of continuous stream measurements from instrument transformers and transducers. File transfer functions are not critical to smart substation operation and have relaxed transfer time requirements. However, data must be sent in limited-size blocks to prevent blocking or delaying other network communications. Time synchronization messages synchronize smart substation IEDs and have no specific transfer time requirements. Station bus raw data messages transfer control orders from HMI functions and have no specific transfer time requirements, despite requiring a higher security level.
Smart substations rely on high-availability networks to prevent service interruptions and their severe consequences for SGs and stakeholders. In addition to the numerous cybersecurity measures that improve availability, high-availability network architectures are often adopted as redundancy mechanisms to obtain zero-delay recovery from single-point network failures.
Basic redundancy support is included in the Ethernet standard, which handles multiple paths between senders and receivers to improve availability. It relies on the Spanning Tree Protocol (STP) [50] and the Rapid Spanning Tree Protocol (RSTP) [51] to suppress LAN loops by locking redundant paths and unlocking them in the presence of faults. However, such reconfiguration is not instantaneous, compromising the availability of SS critical infrastructures.
As an alternative, the IEC 62439 high-availability automation networks standard defines redundancy protocols with zero-delay recovery, essential for a safe and reliable SS operation. The standard consists of six parts (see Table VI), with the following subsections paying special attention to HSR and PRP.
Parallel Redundancy Protocol: The Parallel Redundancy Protocol (PRP) [52], [53], defined in clause 4 of the IEC 62439-3 standard, uses a double-star Ethernet topology to offer zero-delay packet delivery on single-point network failures. The redundancy mechanism duplicates each packet on both LANs, and the copies are delivered to endpoints through dual independent interfaces. Duplicate packets are dropped before delivery, so only one copy is passed to and processed by upper layers. The approach is transparent to higher protocol layers, particularly to the application layer. The Ethernet frame is extended with redundancy control trailer bits. PRP offers zero-delay recovery from network failures, ensures continuous operation under failure conditions, limits network engineering costs through a static implementation, and improves trust over Ethernet. However, PRP requires redundant switching equipment and network transmission lines, and its redundancy control trailer bits are sent at the end of the message, forcing the message to be processed before a frame duplication can be determined. Figure 8 illustrates a PRP network and the packet flow from source to destination, through two LANs, where duplicated data packets are sent through different networks and dropped by the receiver based on redundancy control trailer bits. Single attached nodes are attached to one network, or to both networks through a Redundant box (Redbox) which interfaces with both networks.
Fig. 8. PRP network architecture [52], [53] (SAN - Single Attached Node; DANP - Doubly Attached Node using PRP).
Fig. 9. HSR network architecture [52], [53] (SAN - Single Attached Node; DANH - Doubly Attached Node using HSR).
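The duplicate-discard step described for PRP can be sketched as follows. This is a minimal illustration, assuming that each frame can be identified by its source and by a sequence number carried in the redundancy control trailer; the class name, the window size and the node names are hypothetical and do not come from the standard.

```python
from collections import OrderedDict


class DuplicateDiscard:
    """Keeps the first copy of each frame and drops the later twin."""

    def __init__(self, window=1024):
        self.window = window       # hypothetical bound on remembered frames
        self.seen = OrderedDict()  # (source, sequence number) -> LAN of first copy

    def accept(self, source, seq, lan):
        key = (source, seq)
        if key in self.seen:
            return False           # twin already delivered: drop this copy
        self.seen[key] = lan
        if len(self.seen) > self.window:
            self.seen.popitem(last=False)  # forget the oldest entry
        return True


rx = DuplicateDiscard()
# Frame 7 arrives on both LANs; LAN A fails before frame 8 is sent.
arrivals = [("ied-1", 7, "A"), ("ied-1", 7, "B"), ("ied-1", 8, "B")]
delivered = [(src, seq) for src, seq, lan in arrivals if rx.accept(src, seq, lan)]
print(delivered)
```

Frame 8 still reaches the upper layers although only one LAN carried it, which is the zero-delay recovery property: no reconfiguration is needed when a single path fails.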
High-availability Seamless Redundancy (HSR): HSR [52], [53], defined by the IEC 62439-3 clause 3 standard, offers zero-delay packet delivery on single-point network failures. HSR operates on a ring network topology, with egress packets duplicated in both directions. HSR is implemented in hardware and is transparent to upper layers, with duplicate packets being discarded before delivery to application protocols.
Figure 9 illustrates an HSR ring network and the flow of duplicated packets, sent in opposite ring directions, from the sender to the receiver. Single Attached Nodes (SAN) are connected through a Redundant box (Redbox), with different HSR ring networks being connected by a dedicated QuadBox containing two ports for each HSR ring network.
The adoption of the PRP and HSR seamless redundancy protocols has only a residual impact on network delays during normal operation. However, the effect of redundancy protocols becomes evident in networks with significant background traffic, reducing delays and improving the resiliency against background traffic, as assessed in [54].
Mocanu and Thiriet [55] provide a comprehensive investigation of real-time performance, vulnerabilities, and attack scenarios on the SMV protocol, based on PRP and HSR networks. The communication protocol, network topology, and impact on electrical protection functions are experimentally assessed using real devices in a hardware-in-the-loop setup. The experimental results reveal that the lack of real-time scheduling in HSR affects traffic jitter, which becomes relevant when several SMV messages share the HSR ring. The experiments also demonstrate the resilience of PRP network performance to a combination of message flows. Moreover, PRP networks proved to be more resilient than HSR networks when exposed to false data injection and Ethernet flood attacks. Nevertheless, PRP networks can be effectively protected by a carefully planned network design. Moreover, HSR RedBoxes are vulnerable to flooding. Despite the vulnerabilities found, both field devices and the HSR RedBox recover after flooding attacks.

C. Communications Protocols and Models: IEC Std 61850
IEC 61850, Communication Networks and Systems for Power Utility Automation, is a global standard dedicated to substation automation, developed by the International Electrotechnical Commission (IEC) Technical Committee 57 (TC57). The standard was defined to support interoperability [56], allowing the interconnection of heterogeneous devices with different functionalities, from different vendors. In addition, it simplifies the network architecture, reducing the substation footprint and complexity, and adopts a language to centralize the configuration of the numerous substation components, which helps streamline configuration processes. It adopts a technology-neutral strategy to decouple the object model and related services from communication technologies.
The standard, which is organized into ten parts (see Table VII), defines smart substation communication protocols, focusing on three main domains: (i) Communication, (ii) Data modeling, which defines a virtual model of the substation based on a predefined object-oriented description, and (iii) a Substation Configuration Language (SCL) to support the automation of the engineering process. Test procedures are further defined for device compliance assessment. The success and industry adoption of the standard has contributed to its further extension and development to other domains, such as wind turbines and hydroelectric plants.
IEC 61850 starts with an overview and a glossary of the terminology used, in parts 1 and 2 [57], [58], [59]. Parts 3, 4, and 5 [34], [60], [61] identify and lay out general and specific functional requirements for substation communications. A Configuration Description Language (CDL) is presented and detailed in part 6 [62], describing SS communications and IED configurations. The basic communication structure of substation and feeder equipment is presented in part 7 [63], [64], [65], including principles and data models, Common Data Classes (CDC), the Abstract Communication Service Interface (ACSI), as well as compatible logical nodes and data classes. Part 8-1 [66] specifies a mapping of abstract data objects and services to the Manufacturing Message Specification (MMS). Parts 9-1 and 9-2 [67] define a mapping between sampled measured values (unidirectional point-to-point and bidirectional multipoint) and an Ethernet data frame. Finally, part 10 [68] specifies IEC 61850 implementation conformance testing procedures and measurement techniques.
The IEC 61850 standard offers extended benefits over legacy SASs, providing comprehensive and improved support for substation functions and communications, eased system evolution, as well as simpler specification, design, configuration, setup, and maintenance. The standard was designed with four main objectives in mind:
1) Interoperability. IEC 61850 smart substation protection functions are provided through a seamless integration of multivendor IED equipment, based on unified data structure definitions.
2) Easy configuration. IEC 61850 smart substations adopt an XML-based SCL language to configure substations and their devices. SCL enables information exchange between engineering tools and describes device models, the communication infrastructure, and relationships between the topology of substation components and corresponding SAS logical nodes defined in IEDs.
3) Simple architecture. IEC 61850 smart substations simplify communication links, replacing numerous point-to-point copper cables with simpler high-speed fiber-optic digital communication links. In addition, smart substations adopt a hierarchical architecture with improved communication performance, appropriate for time-critical applications. High-speed fiber-optic digital communication links significantly reduce installation and maintenance costs, reduce manual configuration of self-describing devices, and improve protection capabilities that leverage data exchanged between devices over a process bus.
4) Protection scheme enhancements and new capabilities. The IEC 61850 smart substation process bus exchanges information between devices and promotes coordinated protection schemes among bay-level IEDs. The process bus enables centralized backup protection systems and improved fault tolerance, using alternative devices in case of failures. The new features and advanced services provided by IEC 61850 introduce capabilities not offered by legacy protocols.
Fig. 10. IEC 61850 communication profiles [66].
The IEC 61850 standard defines five communication profiles: (i) ACSI semantic model, based on an object-oriented architecture which delivers client-server service interactions between applications and servers, (ii) SMV multicast, which provides an effective sampled measured value exchange over the process bus, (iii) GOOSE, which enables fast data exchange over the substation bus, (iv) GSSE, which provides an expedited substation-level status exchange, and (v) TimeSync, which provides a time synchronization service. The standard further defines Specific Communication Service Mappings (SCSM) and a project engineering workflow, including an XML-based substation configuration language (SCL) [69] which describes the smart substation automation system, configurations of IEDs and switchyard devices, and system parameters. The SCL file consists of four sections assigned to corresponding file types:
• Substation. The substation section, defined by a System Specification Description (SSD) file, defines the substation Single Line Diagram (SLD), the relationships with logical nodes, and the location of logical nodes in physical devices.
• Communication. The communication section, defined by a Substation Configuration Description (SCD) file, describes the substation and the communication links between IEDs.
• IED. The IED section, defined by an IED Capability Description (ICD) file, defines IEDs' capabilities, configuration, and relationships between logical nodes in different IEDs.
• LN Type. The LN type section, defined by a Configured IED Description (CID) file, describes the configuration of a specific IED based on the data object of a logical node instance.
The substation configuration language offers a centralized description and configuration of electrical substation devices, improving deployment, management, maintenance, and troubleshooting activities.
Data Model: The IEC 61850 standard introduces an object model approach to describe SS functions and services, providing information modeling and functional decomposition.Figure 11 presents IEC 61850 object model entities, data elements, and corresponding relationships.
Each physical device (IED) is modeled as one or more Logical Devices (LD). Each LD, in turn, consists of one or more Logical Nodes (LN) representing IED power system functions. Each Logical Node consists of data object elements, which include the actual measurement data. The types of data elements and the structure of the IEC 61850 object model are defined by the CDC specification [70]. The physical device model offers an abstraction of service and data item definitions, making it protocol-independent, and allowing mapping to the MMS, GOOSE and SMV protocols.
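The containment hierarchy described above (physical device, Logical Device, Logical Node, data object) can be sketched with plain data classes. This is an illustrative structure only: the class names, the instance names (IED1, LD0, MMXU1, Hz) and the "LD/LN.DO" reference string are assumptions made for the example, not definitions taken from the CDC specification.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    name: str
    value: float


@dataclass
class LogicalNode:
    name: str
    data_objects: list = field(default_factory=list)


@dataclass
class LogicalDevice:
    name: str
    logical_nodes: list = field(default_factory=list)


@dataclass
class PhysicalDevice:
    name: str
    logical_devices: list = field(default_factory=list)

    def references(self):
        """Walks the hierarchy and yields (reference, value) pairs."""
        for ld in self.logical_devices:
            for ln in ld.logical_nodes:
                for do in ln.data_objects:
                    yield f"{ld.name}/{ln.name}.{do.name}", do.value


# Hypothetical IED with one logical device and one measurement logical node.
ied = PhysicalDevice("IED1", [
    LogicalDevice("LD0", [
        LogicalNode("MMXU1", [DataObject("Hz", 50.0)]),
    ]),
])
refs = list(ied.references())
print(refs)
```

Because each data object is addressed by its position in the hierarchy rather than by a wire format, the same model can be mapped onto different protocols, which is the protocol independence the text refers to.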
IEC 61850 defines three protocols for SS information exchange: MMS, GOOSE, and SMV, as further described in the following subsections.

Manufacturing Message Specification (MMS): This is an Open System Interconnection (OSI) application layer standard. It is the most common IEC 61850 standard protocol, adopting a client-server communication mechanism to exchange messages between IEDs and SCADA systems, over the network. MMS communication users have distinct client and server roles: the client issues a request, whereas the server processes it and replies when necessary. MMS communications are mostly used in automation and control functions and have no explicit timing requirements, allowing the use of security mechanisms. The communication service mapping to MMS, defined in IEC 61850-8-1 [66], maps the device model to an MMS variable object with a unique name and reference. The abstract definition of the device's services and data elements enables direct mapping of services and objects from IEC 61850 to MMS, as illustrated by the mapping examples in Figure 12.
Generic Object-Oriented Substation Event (GOOSE): This protocol defines the rules for the exchange of data between smart substation IEDs, over Ethernet. The event-based GOOSE protocol relies on a publisher-subscriber model, which enables publishers to periodically send event messages to subscribers without receiving specific requests for each event. The protocol does not provide message delivery confirmation mechanisms and relies on a message burst strategy to reduce message losses, as observed in the example presented in Figure 13.
Figure 14 reveals the GOOSE message Ethernet frame structure and further details the specific structure of the GOOSE PDU section. The topics assigned to published messages allow subscribers to filter out unwanted topics.
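Since GOOSE has no delivery confirmation, a publisher compensates by repeating a new event several times in quick succession. The sketch below illustrates one possible burst schedule, in which the gap between repetitions grows until it settles at a steady interval. The doubling rule and the 2 ms and 1000 ms bounds are assumptions chosen for illustration; they are not values taken from the text or from Figure 13.

```python
def burst_intervals(t_min_ms=2, t_max_ms=1000, factor=2):
    """Returns the gaps (ms) between successive repetitions of one event.

    Hypothetical schedule: start at t_min_ms, multiply by `factor` after
    each repetition, and settle at t_max_ms.
    """
    intervals = []
    t = t_min_ms
    while t < t_max_ms:
        intervals.append(t)
        t *= factor
    intervals.append(t_max_ms)  # steady-state interval once the burst is over
    return intervals


schedule = burst_intervals()
print(schedule)
```

Under these assumed parameters, a subscriber that misses the first copy of an event would see another copy after a few milliseconds, while the steady-state traffic remains low.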
Sampled Measured Values (SMV): This protocol relies on a publisher-subscriber model to efficiently exchange data between smart substation MUs and IEDs, over a network. Figure 15 reveals the SV message Ethernet frame structure and further details the specific structure of the SV Application Protocol Data Unit (APDU) section. Published messages are assigned a topic, which allows subscribers to filter out unwanted messages. SMV publishers send periodic messages at a time interval that depends on two factors: the measured signal frequency and the number of Samples Per Period (SPP). IEC 61850-9-2LE defines two SPP values, 80 and 256. For example, if the signal frequency is 50 Hz and SPP is set to 80, the sending time interval will be 1/(50 × 80) s, or 250 µs.
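The interval arithmetic in the example above can be checked with a few lines of code. The function name is hypothetical; the 50 Hz and 80/256 SPP figures come from the text, and the 60 Hz case is added only for comparison.

```python
def smv_interval_us(frequency_hz, samples_per_period):
    """Time between consecutive SMV messages, in microseconds."""
    return 1_000_000 / (frequency_hz * samples_per_period)


for frequency_hz, spp in [(50, 80), (50, 256), (60, 80)]:
    interval = smv_interval_us(frequency_hz, spp)
    print(f"{frequency_hz} Hz, {spp} SPP: {interval:.3f} us")
```

At 50 Hz and 80 SPP this corresponds to 4000 messages per second from a single publisher, which explains why SMV traffic dominates the process bus load.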

D. Time Synchronisation: IEEE Std 1588
Where classic substations had separate IRIG-B buses for timing, their modern counterparts support inline clock timestamping, supported by a clock hierarchy which is intertwined with the SS communications network. IEEE Std 1588 [71], [72] is a precision clock synchronization protocol for networked measurement and control systems. It defines a Precision Time Protocol (PTP) to provide the precise clock synchronization functions required by synchronous measurement and control of SS and SG systems, over communication networks. The standard relies on hardware-assisted time-stamping between the physical and MAC layers, and considers various types of delays such as cable, switch latency, switch store-and-forward, switch queuing, encoding, and decoding delays. Voltage and current measurements, collected by distributed substations, are time-stamped with an accurate synchronized clock and aggregated by PDCs and MUs, enabling wide-area electrical grid monitoring and protection functions. The protocol allows the synchronization of clocks with different precisions, resolutions, and stability factors, with a precise grandmaster clock typically obtained from a GPS receiver. Each PTP domain is limited to one grandmaster clock, with redundancy achieved by a Best Master Clock (BMC) algorithm.
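The text does not spell out how a clock derives its correction from the exchanged timestamps. As a hedged illustration, the sketch below uses the commonly described delay request-response exchange, in which t1 and t4 are read on the master clock and t2 and t3 on the slave clock. It assumes a symmetric path delay; the function name and the numeric values are hypothetical.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimates slave clock offset and one-way path delay.

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)
    Assumes the path delay is the same in both directions.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


# Hypothetical exchange in nanoseconds: the slave runs 500 ns ahead of the
# master and the one-way path delay is 1000 ns.
offset, delay = ptp_offset_and_delay(t1=0, t2=1500, t3=2000, t4=2500)
print(offset, delay)
```

Any asymmetry between the two directions shows up directly as an offset error, which is one reason the standard accounts for the individual delay contributions listed above.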

VI. WIDE-AREA MONITORING, PROTECTION AND CONTROL
Electrical power systems are adopting advanced WAMPAC functions to improve the efficiency, resilience, and security of SGs, providing dynamic system response, recording and visualization functions, real-time electrical power system model validation and state estimation, detection of security and stability margins, wide-area protection and control, and optimization tools to maintain a stable system loading [73].
Specifically, SG power State Estimation (SE) builds on top of these wide area communication mechanisms, delivering extended monitoring capabilities by inferring the operational power state based on synchrophasor voltage and current measurements collected from multiple electrical power buses, providing valuable information for appropriate billing, and efficient and secure management of the power grid [74].In such contexts, data integrity represents a crucial security measure to ensure an accurate power state estimation and proper grid management [75].This section will start by introducing the synchrophasor system as one of the stepping stones of the WAMPAC infrastructure, followed by a discussion of the most relevant standards for phasor measurement data exchange and telecontrol, on wide area network scenarios.

A. Synchrophasor System
WAMPAC systems rely on synchrophasor system data to offer electric grid management over a wide geographical area. A synchrophasor system is composed of PMUs, PDCs and a synchrophasor archive, as illustrated in Figure 16. Timestamped phasor measurements sampled at each substation are merged by local PDCs and further merged at higher hierarchical levels by corporate and regional PDCs. Synchrophasors provide the essential data required by wide-area measurement systems, which increase the situational awareness of the electrical grid.
PMUs, installed in SSs, collect high-rate voltage and current phasor measurements, sending them to PDCs according to the IEEE C37.118 standard. PDCs align incoming PMU measurement messages from various substations, based on accurate time information, and aggregate them into a single synchronized data stream required to perform basic data quality checks. PDC synchrophasor data may be stored locally or remotely, using centralized or distributed storage techniques, and may be further transmitted to a remote application for additional data processing and analytic knowledge extraction.
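The alignment step performed by a PDC can be sketched as a grouping of measurements by timestamp, followed by a basic completeness check. This is a simplified illustration: the function name, the PMU identifiers, the integer timestamps and the completeness rule are assumptions, and a real PDC would also handle waiting times and late arrivals.

```python
from collections import defaultdict


def align(measurements, expected_pmus):
    """Groups (pmu_id, timestamp, phasor) tuples into one record per timestamp."""
    by_time = defaultdict(dict)
    for pmu_id, timestamp, phasor in measurements:
        by_time[timestamp][pmu_id] = phasor
    stream = []
    for timestamp in sorted(by_time):
        frame = by_time[timestamp]
        complete = set(frame) == set(expected_pmus)  # basic data quality check
        stream.append((timestamp, frame, complete))
    return stream


measurements = [
    ("PMU-A", 1000, 1.00 + 0.02j),
    ("PMU-B", 1000, 0.99 - 0.01j),
    ("PMU-A", 1020, 1.01 + 0.02j),  # the PMU-B sample for 1020 never arrives
]
stream = align(measurements, expected_pmus=["PMU-A", "PMU-B"])
for timestamp, frame, complete in stream:
    print(timestamp, sorted(frame), "complete" if complete else "incomplete")
```

The grouping only works because every PMU stamps its samples against a common time reference, which ties this step to the synchronization mechanisms discussed in Section V.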
Synchrophasor systems are designed to meet scalability, retention, accessibility, network, and analytic requirements. Such systems may include hundreds or even thousands of PMUs and may need to scale as the electrical power grid evolves. Data generated by numerous PMUs must be retained in local or remote storage for auditing purposes. Data access must be limited to authorized actors and encrypted for security and privacy. Inter-component communications must ensure data integrity, privacy, and security.
Synchrophasor systems must be protected by security controls [76] and assessed by vulnerability tests.

B. IEEE Std C37.118: From Synchrophasors to WAMPAC
Synchrophasor and WAMPAC systems rely on synchronization technologies to synchronize electrical measurements sampled from different SSs located across a wide geographical area. The IEEE C37.118 [35], [36], [37], [38] standard specifies message formats and communication protocols to exchange synchronized phasor measurement data among PMUs, PDCs, and other applications. Introduced in 2005, C37.118 [35] specified phasor measurements and communications requirements. It was improved in 2011 with transient performance requirements, and split into two parts: IEEE C37.118.1 [36], and IEEE C37.118.2 [38]. The first part was later superseded by IEEE C37.118.1a [37], which updated PMU performance requirements. IEEE Std C37.118.1 defines synchronized phasor (synchrophasor), frequency, and Rate of Change of Frequency (ROCOF) measurements, as well as corresponding time tag and synchronization requirements for substations, along with requirements for measurement evaluation and compliance. IEEE C37.118.1 defines two performance classes for PMU applications: (i) a higher-accuracy measurement (M) performance class with slower response time, and (ii) a lower-accuracy protection (P) performance class with faster response time, suitable for real-time protection and control applications. IEEE Std C37.118.2 documents the synchrophasor communication requirements for data exchange among PMUs, PDCs, and application devices. It describes message types, contents, data formats, and use, without referring to communication media and transport protocols. The standard is divided into six clauses listed in Table VIII.
Transport, modes and message types: Synchrophasor measurement data is exchanged either over connection-oriented TCP or over connectionless UDP. TCP is the preferred protocol for applications requiring highly reliable bidirectional data transfers but tolerating relatively longer transmission times, whereas UDP is the preferred protocol for applications requiring fast and efficient unidirectional data transmission and tolerating occasional data losses. Despite its occasional data losses, UDP is often preferred for synchrophasor measurement data exchange due to its faster transmission, smaller overhead, and continuous data streams.
Synchrophasor measurement data transfer is based on a client-server communication model, using one of two modes:
• Commanded. Bidirectional communication mode where the receiver initiates a connection with the sender and controls the exchange of configuration and synchrophasor measurement data. The receiver sends requests to the sender, which replies with the requested data.
• Spontaneous. Unidirectional communication mode in which the sender initiates the configuration and continuous synchrophasor measurement data exchange without any negotiation with the receiver.
Synchrophasor measurement data may be sent using one of the following transmission schemes:
• Unicast. Sender transmits to one receiver, in a single network domain.
• Multicast. Sender transmits to many receivers, between different network domains.
• Broadcast. Sender transmits to all receivers.
Frame Structure and Message Types: IEEE C37.118.2 defines a data frame structure, presented in Figure 17. A message frame starts with a SYNC field which identifies the frame type and synchrophasor standard version number. Then, a FRAMESIZE field defines the total frame size. Next, an IDCODE field defines a unique frame identification number. Further, Second-Of-Century (SOC) and Fraction-of-a-Second (FRACSEC) fields define the measurement timestamp or transmission time. Finally, message contents are added and terminated with a CHK Cyclic Redundancy Check field.
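The field order described for the frame (SYNC, FRAMESIZE, IDCODE, SOC, FRACSEC, message contents, CHK) can be illustrated with a small build-and-parse sketch. The field order follows the text; the field widths (2, 2, 2, 4 and 4 bytes, with a 2-byte CHK), the CRC-CCITT parameters (polynomial 0x1021, initial value 0xFFFF) and the SYNC value 0xAA01 are assumptions based on common descriptions of the standard and should be checked against Figure 17 and the standard itself. The function names are hypothetical.

```python
import struct

HEADER = ">HHHII"  # SYNC, FRAMESIZE, IDCODE, SOC, FRACSEC (assumed widths, big-endian)
HEADER_LEN = struct.calcsize(HEADER)  # 14 bytes


def crc_ccitt(data, crc=0xFFFF):
    """Bitwise CRC-CCITT (assumed polynomial 0x1021, initial value 0xFFFF)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def build_frame(sync, idcode, soc, fracsec, payload=b""):
    framesize = HEADER_LEN + len(payload) + 2  # header + contents + CHK
    body = struct.pack(HEADER, sync, framesize, idcode, soc, fracsec) + payload
    return body + struct.pack(">H", crc_ccitt(body))


def parse_frame(frame):
    sync, framesize, idcode, soc, fracsec = struct.unpack(HEADER, frame[:HEADER_LEN])
    (chk,) = struct.unpack(">H", frame[-2:])
    return {
        "sync": sync, "idcode": idcode, "soc": soc, "fracsec": fracsec,
        "payload": frame[HEADER_LEN:-2],
        "size_ok": framesize == len(frame),
        "chk_ok": chk == crc_ccitt(frame[:-2]),
    }


frame = build_frame(sync=0xAA01, idcode=7, soc=1_700_000_000, fracsec=0, payload=b"\x00\x01")
parsed = parse_frame(frame)
print(parsed["size_ok"], parsed["chk_ok"])

# A single flipped bit in the message contents is caught by the CHK field.
corrupted = frame[:HEADER_LEN] + b"\x00\x03" + frame[-2:]
print(parse_frame(corrupted)["chk_ok"])
```

Note that CHK only detects accidental corruption: anyone able to modify a frame can also recompute the checksum, so it offers no protection against deliberate tampering.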
Additionally, IEEE C37.118.2 defines four synchrophasor-related message types:
• Data message (DATA). Exchanges synchrophasor measurement data consisting of a set of measurements, with one or more PMU measurement blocks, sampled at a specific time. Each PMU measurement block consists of time and measurement quality (STAT), estimated synchrophasors, system frequency, ROCOF, analog values, and digital status. [78]
• Command message (CMD). Sends instructions from a receiver to a sender, in commanded mode.

C. IEC 61850-90-5: An Alternative to IEEE C37.118
Synchrophasors may adopt IEEE C37.118 or IEC 61850-90-5 communication networks.In fact, the IEC 61850 standard has been extended over the years, leading to the development of IEC 61850-90-5 [77] to describe mapping mechanisms from IEEE C37.118 to IEC 61850 GOOSE and SV protocols, considering synchronized power measurement over a wide-area network.
For this purpose, the Measurement Logical Node (LN MMXU) was extended with Data Objects (DOs) incorporating synchrophasor parameters such as sampling rate and Rate of Change of Frequency (ROCOF). Transport is achieved either by tunneling or by allowing GOOSE to multicast over IP networks using the IGMPv3 protocol. These R-GOOSE (Routable GOOSE) messages are routed over layer-3 routers with UDP/IP headers.

D. Telecontrol and Protection: IEC 60870-5 and DNP3
IEC 60870 provides a set of standards defining systems for telecontrol (SCADA and power system automation); specifically, part 5 describes a profile for sending telecontrol messages between two systems. IEC 60870-5-101 and -104 [79] apply to telecontrol equipment and systems for controlling and monitoring geographically dispersed processes, as is the case for the SG. IEC 60870-5-104, whose application layer is based on IEC 60870-5-101 (which is designed for serial lines), provides TCP/IP communication between control stations and substations, using transports other than serial lines (such as Ethernet).
Another protocol used for this purpose is specified by IEEE 1815, also known as the Distributed Network Protocol, version 3 (DNP3) [80], which offers the same basic functionality as IEC 60870-5 but adds other features, such as security (2-pass controls) and the encapsulation of multiple data types in a single message to improve efficiency. It also supports an "Unsolicited" mode, where devices can report events without being polled, which is useful for low-rate polling setups.

VII. SS THREAT LANDSCAPE OVERVIEW
This section intends to provide an encompassing perspective about known incidents, attack categories, vulnerabilities, targets and security issues that affect SSs and, to a certain extent, also the associated SGs, as a whole.

A. The Energy Sector as a Prime Target
As of 2014, cyberattacks against the energy sector constituted the majority of reported cases, accounting for 79.32% of the total number of industrial control system cyber incidents [81], demonstrating the interest of cybercriminals in the sector.In 2015, a global state information security survey revealed a six-fold increase in the number of cyber events detected in 2014, with the energy sector accounting for 15.6% of the total number of incidents reported [82].
Also, the fact that attacks on such infrastructures often require skill and preparation does not seem to deter attackers. A detailed classification of 2431 cyber events, collected from the North American Industry Classification System (NAICS), reveals a wide adoption of exploitive cyberattack strategies, as opposed to disruptive strategies, which account for merely 30% of the analyzed cyber events [83]. This reveals a significant investment in scouting and information gathering, which are key for the preparation of a successful attack.
Within the energy sector, substations constitute strategic targets due to their role in the complex multi-sector energy interdependency scenario, with attack numbers following the general trend [84], [85], with several major incidents, as shown in Table X and discussed next.
In 2003, a Slammer [86] worm cyberattack was launched against a nuclear power plant located in the United States, causing system damages and financial losses. In 2010, the Night Dragon [87] malware cyberattack collected sensitive industrial data to obtain competitive advantages, raising concerns over the impact of information and intellectual property theft in the electrical industry. It was also in 2010 that the Stuxnet [88] malware made headlines. While unrelated to the energy sector, it constitutes a relevant milestone, which helped raise awareness about cyberattacks on industrial control systems and their consequences.
In 2011, the BlackEnergy [89] malware cyberattack campaign compromised numerous ICS components to steal information from targeted systems.Despite the hacking malware campaign, no attempt was made to activate the malware and damage, modify or disrupt the industrial control process, which could have resulted in the shutdown of numerous critical infrastructure components.
In 2012, Shamoon [90] malware cyberattacks were launched against energy companies in the Middle East, adopting a destructive approach by erasing and overwriting data stored in hard drives, disabling numerous workstations.In early 2014, an Energetic Bear [92] campaign successfully infiltrated computers and systems of over one thousand electrical sector companies, gaining unauthorized access to sensitive critical data and disrupting the electric power supply.Later, in 2015, the Black Energy 3 (BE3) [93] cyberattack was carried out against three Ukraine electric power distribution companies, obtaining unauthorized access to computers and SCADA systems.The coordinated attack successfully opened circuit breakers of seven substations, resulting in a blackout that affected over two hundred thousand people for over six hours.
In December 2016, CrashOverride malware impacted a single transmission level substation in Ukraine.Further analysis revealed that it included several functionalities at proof-ofconcept level, using the OPC protocol to map the environment and select targets, also targeting HMI libraries and configuration files, also attempting to connect to Internet-connected locations [94].
Also in 2016, the Industroyer malware targeted a Ukrainian electricity substation, causing a power outage affecting at least one-fifth of Kyiv [95]. Later on, in 2022 (shortly after the start of the Russian invasion of Ukraine), a second Industroyer variant (Industroyer2) was found to be targeting regional electrical substations, having been detected and mitigated before causing incidents [96].
Such incidents can only be prevented and/or mitigated by introducing protection mechanisms addressing the specific SS attack surface, to be characterised in the next subsection.

B. Attack Surface and Targets
According to Stallings [97], the four fundamental security requirements in any system are: Confidentiality (information access protection); Integrity (protection against information modification or theft); Availability (preserving service availability and operational status); and Non-repudiation (assurance that someone cannot deny the validity of something or the authorship of an action). However, it should be noted that different domains diverge when it comes to prioritising security requirements: for IT, confidentiality comes first, followed by integrity and availability, while OT systems invert these priorities, placing availability first [98], [99].
By definition, cyberattacks can be considered as actions aiming at compromising any of the four fundamental security requirements. Thus, it should come as no surprise that designing a comprehensive cybersecurity strategy requires assessing the exposure level of the infrastructure, while considering domain-specific needs and constraints, to achieve a balance between exposure minimisation and maintenance of functional system integrity.
SS cyberattacks often attempt to disrupt or manipulate substation data, processes, or communication flows by exploiting weak or compromised credentials, system misconfiguration, missing or poor encryption methods, trust relationships, phishing methods, or even by resorting to malicious insiders. Such attacks may focus on a wide range of targets, from configurations and devices to measurement data, and even sensors or actuators. The affected contexts can encompass the following:
Physical: Physical attacks are conducted to cause physical damage to SS components such as connected hardware actuators or sensors, with potentially devastating results. As an example, a cyberattack may deliberately cause a trip action and eventually interrupt the electrical power supply, causing a widespread blackout, with serious consequences. Data obtained from physical electrical properties governed by the network topology and the laws of physics can be leveraged to achieve effective intrusion detection in electrical grids and SSs [100], [101], supporting the validation of data collected from networks and host process activities.
Network: Substation communications encompass channels and network equipment. Network attacks disrupt or degrade communication network operation without modifying data frames. Typical network attacks exploit flooding techniques, such as DoS and DDoS, to compromise network communications, delaying or preventing data exchange between SS equipment.
Protocol: Protocol attacks target vulnerabilities of communication protocols, eavesdropping on or modifying data contents to disrupt or degrade normal SS operation. Smart substations are commonly based on SCADA systems whose components communicate over networks using widely adopted protocols, including Modbus, DNP3, IEC 60870-5-101, IEC 60870-5-104, IEEE C37.118 or IEC 61850.
Control: Control attacks modify configurations or software running on intelligent control systems and devices such as PLCs and IEDs, which collect sensor data and control various substation components, disrupting normal operation or causing permanent damage. Such attacks may target components such as: (i) protection and control IED software/firmware, (ii) network equipment software/firmware, and (iii) computers, HMI, engineering stations, and test station software.
This overview confirms something that IEC 62351-1 [102] explicitly suggests, namely that security represents a collection of issues that are highly sophisticated and spread over different dimensions, and that the security field cannot be broken down into smaller, more manageable portions in a standard and clear way. The diversified nature of the SS attack surface confirms this statement, in the sense that effective protection may involve the development of multi-layered strategies designed to take into account different domains, in articulation with effective policies and practices.

C. SS-Level Vulnerable Vectors
Jiwen and Shanmei [103] propose a comprehensive vulnerability assessment platform. The proposed method scans known vulnerabilities based on a simulation of cyberattacks and identifies system and protocol vulnerabilities based on a protocol fuzz test process. Experimental results, obtained on a test SS, revealed that many engineering stations were exposed to high-risk known vulnerabilities and that a few IEDs were exposed, mostly to low-risk vulnerabilities. Moreover, various unknown IED vulnerabilities were identified at bay and process levels. Chattopadhyay et al. [104] also provide a similar overview of IED vulnerabilities, showcasing how IEDs subjected to fault injection attacks and infected by hardware trojans can compromise power grid integrity and availability. Finally, Wright and Wolthusen [105] identify the minimum capabilities required by an adversary to successfully inject a malicious message into an IED and cause undesirable actions, and test the effectiveness of various countermeasures.
Various types of cyberattacks on IEC 61850 SS networks are identified in [106], along with attack scenarios and the impact on SS operation. GOOSE protocol vulnerabilities are assessed in [107], based on analytical approaches and experimental results, with and without the measures introduced by IEC 62351 security schemes (cf. Section X-C). The results demonstrate that GOOSE communications show security vulnerabilities and may be targeted by cyberattacks even if compliant with the IEC 62351 security standard, requiring additional intrusion detection and prevention solutions to address security vulnerabilities in the system design. This is confirmed by other works documenting a series of poisoning [108], [109], replay, tampering and MiTM attacks on GOOSE and SV messages [100], [110]. Ashraf et al. [111] also analyse the impact of DoS attacks in IEC SS architectures, covering several DoS categories evaluated on an OPNET-based scenario, targeting servers, HMIs and IEDs, and also studying jitter effects.
GOOSE data and command messages, sent across the smart substation, are subject to strict real-time timing requirements, hampering the implementation of security mechanisms. Elbez et al. [112] review the security recommendations proposed by IEC 62351-6 and identify related vulnerabilities with respect to GOOSE messages, presenting and assessing the performance of authentication schemes to secure GOOSE messages. The results demonstrate that the RSASSA-PSS probabilistic signature scheme does not comply with the strict timing requirements of the SS process bus. As an alternative, the authors propose a security method based on symmetric authentication mechanisms, such as HMAC-SHA256, with reduced computational time meeting the strict GOOSE timing requirements. Vaidya et al. [113] also proposed a multilevel multi-factor authentication and attribute-based authorization mechanism based on a PKI infrastructure and ensured by zero-knowledge protocol-based server-aided verification and access control mechanisms, using a Substation Controller to help IEDs authenticate remote users.
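As a minimal sketch of the symmetric authentication idea mentioned above, the following Python fragment appends an HMAC-SHA256 tag to a message and verifies it on reception. The key, the payload encoding and the function names are hypothetical and chosen only for illustration; the fragment does not reproduce the IEC 62351-6 frame layout nor the implementation evaluated in [112].

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 produces a 32-byte tag


def tag_message(key: bytes, payload: bytes) -> bytes:
    """Return the payload followed by its HMAC-SHA256 tag."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify_message(key: bytes, frame: bytes) -> bool:
    """Recompute the tag over the payload and compare in constant time."""
    if len(frame) < TAG_LEN:
        return False
    payload, tag = frame[:-TAG_LEN], frame[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


# Hypothetical pre-shared key and payload encoding, for illustration only.
key = b"pre-shared-demo-key"
payload = b"stNum=7;sqNum=0;trip=1"
frame = tag_message(key, payload)
tampered = frame.replace(b"trip=1", b"trip=0")  # attacker flips the command
```

In this sketch, `verify_message(key, frame)` accepts the untouched frame and rejects `tampered`, since the tag no longer matches the modified payload. Both operations are a single keyed hash, which is the property that makes symmetric schemes attractive under tight latency budgets.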
In an attempt to secure GOOSE messages, Boakye-Boateng and Lashkari [114] demonstrated the feasibility of using the One-Time Pad (OTP) encryption technique, which is theoretically unbreakable, to encrypt and decrypt GOOSE messages.
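The OTP principle can be sketched in a few lines of Python: the message is combined with a random pad of the same length using XOR, and the same operation recovers the plaintext. The message content and helper name below are hypothetical, and the sketch does not reflect how [114] distributes or manages pads. The security guarantee only holds if the pad is truly random, at least as long as the message, and never reused; OTP by itself provides no integrity protection.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad; the same call encrypts and decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"GOOSE:trip=1"                  # hypothetical payload
pad = secrets.token_bytes(len(message))    # fresh pad, used exactly once
ciphertext = otp_xor(message, pad)
recovered = otp_xor(ciphertext, pad)
```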
The importance of clock synchronization for measurements, events, and system states requires accurate detection of delay attacks even on encrypted clock synchronization traffic, with a focus on PTP security. Annessi et al. [115] analyze the protection of encrypted high-precision clock synchronization protocols against delay attacks and successfully use statistical traffic analysis to identify selective message delay attacks.

D. WAMPAC and WAN-Level Vulnerable Vectors
Khan et al. [116] characterized the embedded features of the IEEE C37.118 and IEC 61850-90-5 synchrophasor communication frameworks, as well as the resources, network requirements, and resilience to cyberattacks. The study encompassed an analysis of the impact of different cyberattack types against IEEE C37.118-2 communication systems and synchrophasor applications, including reconnaissance, authentication/access, MiTM, and replay or reflection attacks, and also recommended an efficient security mechanism with strong protection against cyberattacks.
The authors concluded that synchrophasor data communication security is mostly unaddressed in IEEE C37.118. Similar problems affect the IEC 60870-5-101, -104 [118] and DNP3 [119] telecontrol protocols, which lack any sort of protection mechanisms, such as encryption or authentication. There are provisions to secure them within the IEC 62351 standard, which will be discussed in Section X.

E. Timing and Time Synchronization Security
Another issue that is of utmost relevance for SG protection has to do with timestamping integrity. The reliance on the Precision Time Protocol (PTP, IEEE 1588) exposes SGs to the inherent protocol vulnerabilities that are characteristic of earlier versions, as discussed in [120], which also proposed a detection mechanism for fake timestamps.
Version 2.1 of the protocol introduced protection measures, including an authentication TLV and cryptographic Integrity Check Values (ICV) for integrity checking (using Hash-based Message Authentication Codes, which makes sense considering that latency is paramount for time-sensitive applications) with minimal overhead, as shown by [121]. However, neither the authentication TLV nor the support for external secure transport mechanisms can defend PTP v2.1 from delay attacks [122]. Moreover, redundancy can also be defeated by simultaneous attacks against multiple systems, paths or grandmasters [123].
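The reason why message authentication does not help against delay attacks can be illustrated with the standard PTP delay request-response offset computation, offset = ((t2 - t1) - (t4 - t3)) / 2, which assumes symmetric path delays. The Python sketch below simulates one exchange; the function names and all numeric values (in microseconds) are hypothetical. An attacker who merely holds back the Sync message in one direction, without altering a single bit, shifts the computed offset by half of the injected delay.

```python
def ptp_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Slave clock offset as computed from the four PTP timestamps."""
    return ((t2 - t1) - (t4 - t3)) / 2


def exchange(true_offset, d_ms, d_sm, t1=0.0, turnaround=10.0):
    """Simulate one Sync / Delay_Req exchange.

    d_ms: master-to-slave path delay, d_sm: slave-to-master path delay.
    t1 and t4 are read on the master clock, t2 and t3 on the slave clock.
    """
    t2 = t1 + d_ms + true_offset       # Sync arrival, slave clock
    t3 = t2 + turnaround               # Delay_Req departure, slave clock
    t4 = (t3 - true_offset) + d_sm     # Delay_Req arrival, master clock
    return t1, t2, t3, t4


# Symmetric paths: the true offset of 5 us is recovered.
honest = ptp_offset(*exchange(true_offset=5.0, d_ms=2.0, d_sm=2.0))

# Attacker adds 8 us to the master-to-slave direction only.
attacked = ptp_offset(*exchange(true_offset=5.0, d_ms=2.0 + 8.0, d_sm=2.0))
error = attacked - honest              # half of the injected asymmetry
```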

F. SS Threat Model
Figure 19 provides an overview of the SS threat model, identifying several potential cyberthreats that result from the successful exploitation of the attack vectors and targets described in the previous subsections, with Table XI identifying the figure tags.
Such potential vulnerabilities can be exploited to accomplish malicious objectives, being leveraged for several purposes: from deploying passive attacks conceived to disclose sensitive information or perform scouting activities to implementing active attacks focused on service disruption/degradation. The former can be accomplished by compromising devices (for instance, using malware), to gather configuration or state information, or networks, in order to record and/or study protocol payloads via network trace capture or MiTM attacks. Active attacks, on the other hand, may attempt to disrupt or degrade the availability of system resources, by resorting to DoS and Distributed Denial of Service (DDoS) attacks originating from multiple sources, eventually blocking or severely impacting SS operation. Alternatively, such attacks can also attempt to delay, falsify, spoof and/or compromise command, clock or telemetry integrity, by means of techniques such as network trace replaying, endpoint malware infection, compromising of strategic network points or by performing MiTM attacks between communication peers.

TABLE XI CYBER VULNERABILITY LIST
As part of the threat model development, a STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) [124] threat categorization was established, which is depicted in Figure 20. The STRIDE threat classification provides a way to organize the threats previously identified into six categories, each one being the violation of a desirable security property, namely (and in the same order as the STRIDE categories): Authenticity, Integrity, Non-repudiation, Confidentiality, Availability, and Authorization. It should be noted that elements of this framework are often intertwined: for instance, when it comes to repudiation, it must be considered that a spoofed account can help a malicious actor erase its footprints, by tampering with logs or deleting any other significant evidence of activity.
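The category-to-property correspondence can be written down as a simple lookup, which also makes the intertwining of categories explicit. In the Python sketch below, the `STRIDE` mapping follows the text above, whereas the threat names and their tagging in `THREATS` are hypothetical examples and are not taken from Figure 20 or Table XI.

```python
# STRIDE category -> security property it violates (as listed in the text).
STRIDE = {
    "Spoofing": "Authenticity",
    "Tampering": "Integrity",
    "Repudiation": "Non-repudiation",
    "Information Disclosure": "Confidentiality",
    "Denial of Service": "Availability",
    "Elevation of Privilege": "Authorization",
}

# Hypothetical tagging of example threats; one threat may span categories.
THREATS = {
    "GOOSE message spoofing": ["Spoofing"],
    "MiTM modification of telemetry": ["Tampering", "Information Disclosure"],
    "DDoS flooding of the process bus": ["Denial of Service"],
    "Log tampering via spoofed account": ["Spoofing", "Tampering", "Repudiation"],
}


def violated_properties(threat: str) -> list:
    """Return the sorted security properties violated by a tagged threat."""
    return sorted({STRIDE[category] for category in THREATS[threat]})
```

For instance, the log tampering example from the paragraph above maps to three properties at once (Authenticity, Integrity and Non-repudiation).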
The SS threat model presented here will next be leveraged for two purposes: first, to provide solid insight to better introduce the intrusion detection techniques and standardisation frameworks to be discussed in the next sections and, second, to provide one of the cornerstones for the defense-in-depth strategy to be presented in Section XI.

VIII. INTRUSION DETECTION FOR SS: A REVIEW
Over the past years, multiple intrusion detection methods for SS have been proposed in the literature. This section provides a comprehensive review of proposed solutions, organized along three different cybersecurity detection scopes: intra-SS, WAMPAC/horizontal/telecontrol communication and clock integrity levels.

A. Cyber-Detection Inside Smart Substations
This section presents a series of developments regarding cybersecurity detection methods focused on intra-SS communications, which are summarized in Table XII and discussed in more detail below. While the presentation is organised along five categories characterising the dominant technique being used, it should be pointed out that these are far from being mutually exclusive, with many approaches resorting to ensemble strategies.
Formal specification and state machine-based approaches: Bohara et al. [129] propose a method for real-time GOOSE poisoning attack detection on the IEC 61850 SS protocol. This method extracts relevant network protocol features and uses a state machine to perform GOOSE message specification analysis and detect violations typically resulting from poisoning attacks. A comprehensive systematic evaluation, based on a synthetic dataset [139], demonstrates the ability of the proposed approach to accurately detect all message suppression and data poisoning attacks, in real-time.
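The general idea of specification-based checking can be sketched with the GOOSE state and sequence counters. The Python fragment below is an illustrative simplification and not the detector of [129]: it assumes that a retransmission keeps stNum and increments sqNum by one, that a new event increments stNum by one and resets sqNum to zero, and it ignores counter wrap-around and timing rules. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GooseState:
    st_num: int   # state number, assumed to change only on a new event
    sq_num: int   # sequence number, assumed to count retransmissions


class GooseSequenceChecker:
    """Tracks the last accepted counters per publisher and flags violations."""

    def __init__(self):
        self._last = {}  # publisher identifier -> GooseState

    def check(self, publisher: str, st_num: int, sq_num: int) -> str:
        prev = self._last.get(publisher)
        if prev is None:
            verdict = "first"
        elif st_num == prev.st_num and sq_num == prev.sq_num + 1:
            verdict = "retransmission"
        elif st_num == prev.st_num + 1 and sq_num == 0:
            verdict = "event"
        else:
            return "violation"  # keep the last accepted state unchanged
        self._last[publisher] = GooseState(st_num, sq_num)
        return verdict


checker = GooseSequenceChecker()
trace = [(1, 0), (1, 1), (2, 0), (9, 0), (2, 1)]  # (stNum, sqNum), hypothetical
verdicts = [checker.check("IED_A", st, sq) for st, sq in trace]
```

In this trace, the message with stNum 9 is flagged because it skips ahead of the expected state number, which is the kind of jump an attacker may inject to make subscribers discard subsequent legitimate messages.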
Hong and Liu [131] propose a Collaborative Intrusion Detection System (CIDS) with individual IDS modules deployed and distributed across IEDs. The proposed CIDS adopts a distributed computing model to monitor each IED's host system and communication messages, collaborating with neighbor IEDs to accurately detect and determine the origin of cyberattacks. The distributed IDS model offers backup protection when individual IDSs are compromised, and extends substation protection coverage. A Message Authentication Code (MAC) is appended to communications between IEDs to prevent fabricated confirmations. Experiments, conducted on a hardware-in-the-loop testbed with a power system simulated by a real-time digital simulator, demonstrate the effectiveness and accuracy of the proposed CIDS in four realistic intrusion scenarios, without interrupting or delaying the IEDs' protection functions. However, simultaneous attacks launched on both collaborative IDSs cannot be detected.
A GOOSE message spoofing attack detection method on the IEC 61850 SS protocol is presented by da Silva and Coury [132]. The approach adopts validation logic based on message fields and classifies GOOSE messages as (i) normal operation, (ii) event, and (iii) stop/reconfiguration. Messages not classified in any of the three known categories are classified as false and identified as spoofing attacks. Experimental results, obtained on dedicated hardware, demonstrate the ability of the proposed approach to accurately detect spoofed GOOSE messages.
Approaches based on physical properties and state prediction: Zhu et al. [126] propose an intrusion detection system for IEC 61850-based substations to identify attacks targeting MMS measurement messages exchanged in the station bus, between IEDs and the gateway. The proposed method cross-checks the consistency of MMS measurement messages in a distributed manner based on the laws of physics governing the electrical network. The expected measurements are determined based on Kirchhoff's Current Law (KCL), Kirchhoff's Voltage Law (KVL), and Ohm's law. Anomalous measurement messages are detected and discarded to prevent them from being used to manage substations and, eventually, the electrical grid. The proposed IDS was evaluated on a dedicated cyber-physical testbed consisting of a power system simulator, under four attack scenarios. The results reveal a maximum detection time of 150 ms and a detection rate of 100% for packet rates up to 1000 per second, demonstrating the proposed IDS's ability to detect falsified measurements within a mixed data stream.
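A consistency check based on KCL can be sketched as follows: the phasor currents reported for all branches connected to a node should sum to approximately zero, so a large residual indicates that at least one reported value is wrong. The Python fragment below is a simplified illustration and not the distributed algorithm of [126]; the current values, the sign convention and the tolerance are hypothetical.

```python
def kcl_residual(branch_currents) -> float:
    """Magnitude of the sum of complex currents flowing into a node (A)."""
    return abs(sum(branch_currents))


def is_consistent(branch_currents, tolerance: float = 0.5) -> bool:
    """Accept the measurement set if the KCL residual is within tolerance."""
    return kcl_residual(branch_currents) <= tolerance


# Hypothetical busbar currents in amperes (positive means into the node).
honest = [100 + 20j, -60 - 12j, -40 - 8j]
falsified = [100 + 20j, -60 - 12j, -10 - 8j]   # one feeder reading altered
```

Here the honest set yields a zero residual, whereas the falsified set leaves 30 A unaccounted for and is rejected. Locating which individual measurement is wrong requires additional redundancy, for instance KVL and Ohm's law checks across neighbouring nodes.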
An SVM detection method for bad measurement values is proposed by Wu et al. [136], based on a Weighted Least Square (WLS) state prediction method. The method identifies electrical and physical connection substation topologies from the SSD configuration file, collects the states of circuit breakers and disconnectors, performs a data quality check based on IEC 61850-8 quality bits, and filters the inputs before computing the incidence matrix H used by a linear WLS state estimator. The WLS state estimation is then used to detect and handle bad data. The proposed method relies on protection current transformers to provide alternative, less accurate measurements, to compensate for measurement errors caused by bad data. Experimental results, obtained from three bad data scenarios simulated on an IEEE 14 bus system, reveal high detection rates of 100%, 95%, and 80% for one, two, and three bad data scenarios, respectively. However, actual detection rates are slightly lower due to the method of bad data generation, which produces some bad data samples close to true data.
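To illustrate how a linear WLS estimate can expose bad data, the Python sketch below considers the simplest possible case: several sensors measuring the same quantity, so that H is a column of ones and the estimate reduces to a weighted mean with weights 1/sigma^2. Bad data are removed with the largest normalized residual test, which is one common choice and not necessarily the procedure used in [136]. The measurement values, standard deviations, threshold and function names are hypothetical.

```python
import math


def wls_estimate(z, sigma):
    """WLS estimate of a scalar state measured directly by every sensor."""
    weights = [1.0 / s ** 2 for s in sigma]
    return sum(w * zi for w, zi in zip(weights, z)) / sum(weights)


def normalized_residuals(z, sigma):
    """Residuals divided by their standard deviation under the WLS model."""
    w_sum = sum(1.0 / s ** 2 for s in sigma)
    x_hat = wls_estimate(z, sigma)
    # For H = column of ones, the residual variance is sigma_i^2 - 1 / w_sum.
    return [(zi - x_hat) / math.sqrt(s ** 2 - 1.0 / w_sum)
            for zi, s in zip(z, sigma)]


def remove_bad_data(z, sigma, threshold=3.0):
    """Iteratively drop the measurement with the largest normalized residual."""
    z, sigma, removed = list(z), list(sigma), []
    while len(z) > 2:
        r = normalized_residuals(z, sigma)
        worst = max(range(len(z)), key=lambda i: abs(r[i]))
        if abs(r[worst]) <= threshold:
            break
        removed.append(z.pop(worst))
        sigma.pop(worst)
    return wls_estimate(z, sigma), removed


# Hypothetical per-unit current readings; the last one is corrupted.
readings = [1.01, 0.99, 1.00, 1.30]
estimate, rejected = remove_bad_data(readings, sigma=[0.01] * 4)
```

With these values the corrupted reading of 1.30 is rejected in the first iteration and the estimate settles at approximately 1.00 per unit.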
Macwan et al. [138] present a scheme based on an IEC 61850 substation with radial topology and Kirchhoff's voltage and current laws. The proposed scheme locates a limited number of malicious measurements from IEDs at different substation locations, resulting from data injection attacks, relay misconfigurations, or sensor failures. A collaborative mechanism detects and deters data injection attacks under various electrical grid operation scenarios. Experimental results, based on a hardware-in-the-loop setup with a substation Real-Time Digital Simulator (RTDS) and real IEDs, demonstrate the ability of the proposed method to consistently detect and deter data injection and misconfiguration attacks, as well as sensor failures in IEC 61850 smart substations, under steady state and faulty conditions, for attacks injected at different locations.
Approaches based on statistical or time-series analysis: Elbez et al. [127] model network GOOSE communications in IEC 61850-based substations using an Auto-Regressive Fractionally Integrated Moving Average (ARFIMA) model. Two variants of a novel anomaly intrusion detection system are proposed to detect flooding attacks based on: (i) Generalized Likelihood Ratio Test (GLRT) statistical hypothesis testing, and (ii) Cumulative Sum (CUMSUM). The proposed IDS is tailored for specific GOOSE message features, such as message frequency, the number of received messages, and the timestamp of the most recent messages. The IDS was tested with 25 Monte-Carlo simulations performed under different threshold and noise levels. Experimental results reveal improved detection performance of the GLRT and CUMSUM IDS variants compared with baselines, with average false positive rates (FPRs) of 6.66% and 7.72% for the GLRT and CUMSUM variants, respectively, and average false negative rates (FNRs) of 8.12% and 7.4% for the GLRT and CUMSUM variants, respectively.
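The cumulative sum principle can be sketched as a one-sided detector applied to the number of messages received per interval: deviations above the expected level accumulate until a threshold is crossed. The Python fragment below is a textbook formulation under hypothetical parameters and message counts; it operates on raw counts and does not reproduce the ARFIMA modelling of [127].

```python
def cusum_alarm_index(samples, mean, slack, threshold):
    """Return the index of the first alarm, or None if no alarm is raised.

    The statistic accumulates (x - mean - slack) and is clamped at zero,
    so small fluctuations around the mean do not build up over time.
    """
    s = 0.0
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - mean - slack))
        if s > threshold:
            return i
    return None


# Hypothetical GOOSE message counts per interval.
normal = [10, 11, 9, 10, 12, 10, 9, 11]
flood = normal + [25, 30, 28, 35]          # flooding starts at index 8

no_alarm = cusum_alarm_index(normal, mean=10, slack=2, threshold=20)
alarm_at = cusum_alarm_index(flood, mean=10, slack=2, threshold=20)
```

With these parameters the normal trace never raises an alarm, while the flooded trace raises one at index 9, one interval after the flood begins. The slack and threshold jointly control the trade-off between detection delay and false positives.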

DL and ML-based approaches: A DDoS network attack detection method on the IEC 61850 SS protocol is presented by da Silva and Coury [128]. The proposed method starts by classifying GOOSE message traffic as normal, event, or DDoS attack, based on a method inspired by [132], and further relies on a Nonlinear Autoregressive Model with Exogenous Input Artificial Neural Network (NARX ANN), trained with authentic network traffic, to differentiate authentic network traffic from DDoS attacks. The NARX ANN produces a low prediction error rate when presented with authentic network traffic and a high prediction error rate in the presence of DDoS attack network traffic. The proposed method relied on 62 prediction steps to achieve a relative error of up to 5%. Experimental results demonstrate the method's ability to determine electrical system operation signatures, which enable the detection of DDoS attacks in SSs.
Hariri et al. [100] present a comprehensive analysis of the IEC 61850 SMV protocol, its vulnerabilities, and related cyber threats such as DoS, eavesdropping, replay, MiTM and spoofing. The authors further review existing security approaches and investigate the feasibility of neural network predictors to classify legitimate and spoofed SMV messages. The proposed neural network predictor relies on a cyber defense layer to perform a structural validation of SMV messages, and a physical-based defense layer to assess SMV values based on the physical characteristics of the electrical network. The evaluation relies on a microgrid simulation model to generate datasets of legitimate SMV messages produced during normal grid operation, as well as numerous faults, transmission line loss, or power generator events. The proposed approach was tested on fake SMV packets generated by a malware development process. Simulation results reveal that, despite the high detection accuracy for spoofed data achieved by neural network predictors, the detection accuracy decreases with an increasing accumulation of prediction errors. The work was further extended in [101] with a detector of accumulation of prediction errors, based on lightweight statistical indicators selected experimentally. Additional experiments were conducted on a testbed, demonstrating the correctness of the grid simulation model, and the ability of the proposed method to detect spoofed SMV messages in three distinct scenarios.
Lahza et al. [130] take a different approach and evaluate fifty-five suitable features for detecting DDoS attacks on GOOSE and MMS protocols. The extensive set of features found in the literature does not capture the periodic behavior of protocols. As a result, a three-step process is defined to select the window size, identify pre-identified conditions, and select candidate base features. The process leverages the features found and relies on domain knowledge to introduce seventeen new time-window-based GOOSE and MMS advanced features. The proposed feature set was comprehensively evaluated based on datasets containing normal operation live data packets captured from a utility, and attack data created during the initial pre-processing steps. Experimental results, based on the proposed feature set and a ten-fold cross-validation method, reveal accuracies of 99.99%, 99.83%, and 99.40% for Decision Tree (DT), Neural Network (NN), and Support Vector Machine (SVM) classifiers, respectively. Moreover, the validation of the proposed feature set on an unseen test dataset reveals accuracies of 99.04%, 90.13%, and 99.43% for DT, NN, and SVM classifiers, respectively. The results demonstrate classification performance outperforming previous works.
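The notion of a time-window-based feature can be illustrated with a minimal Python sketch that buckets message timestamps into fixed windows and derives simple statistics from the per-window counts. The three features computed here, the window size and the synthetic timestamps are hypothetical and are not the seventeen features proposed in [130].

```python
from collections import Counter


def window_counts(timestamps, window):
    """Number of messages in each consecutive window of `window` seconds."""
    counts = Counter(int(t // window) for t in timestamps)
    if not counts:
        return []
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


def window_features(timestamps, window):
    """Simple per-trace features derived from the window counts."""
    counts = window_counts(timestamps, window)
    mean = sum(counts) / len(counts)
    return {
        "max_per_window": max(counts),
        "mean_per_window": mean,
        "peak_to_mean": max(counts) / mean,
    }


# Hypothetical traces: a periodic heartbeat, and the same with a burst.
periodic = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
burst = [3 + 0.01 * i for i in range(1, 21)]   # 20 extra messages in window 3
bursty = periodic + burst

features_periodic = window_features(periodic, window=1.0)
features_bursty = window_features(bursty, window=1.0)
```

A strictly periodic publisher yields a peak-to-mean ratio of one, whereas the burst concentrates 21 messages in a single window and raises the ratio well above one, which a classifier can exploit.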
Data collected from a real-world substation operation does not contain abnormal behaviors or attack traffic, preventing the simple use of traditional classifiers such as neural networks and support vector machines. Fu et al. [135] propose an IDS for substations to detect process-level network abnormal behaviors based on a One-Class Extreme Learning Machine (OC-ELM) classifier. The proposed method extracts feature vectors from GOOSE messages, selected based on field relationships and correlations between fields and network behaviors. The proposed OC-ELM classifier is trained with a real dataset collected from a substation under construction and compared with state-of-the-art algorithms such as Gaussian Distribution (GaussianDD), K-Nearest Neighbor Data Description (KNNDD), and Principal Component Analysis (PCA). Experimental results, obtained on a real dataset, reveal that OC-ELM outperforms state-of-the-art algorithms, producing a false negative rate of 8.81%, which contrasts with 9.99%, 9.95%, and 10.00% false negative rates achieved by GaussianDD, KNNDD, and PCA, respectively.
El Hariri et al. [133] propose a detection system to identify fake SMV messages in IEC 61850 SSs, along with a time series neural network to predict lost SMV messages. The proposed detection system prevents and deters false data injection attacks. The neural network was trained with data generated from a simulated electrical network operating under grid-connected and isolated modes and subject to five types of faults applied at numerous locations of each transmission line. Evaluation experiments, conducted on an IEC 61850 SG testbed, offered good fake message detection performance, with only 0.56% false positive predictions, enhancing the protection system resiliency against fake measurements. Moreover, the neural network revealed an accurate prediction of lost SMV messages, with 99% of data predictions showing errors under 1.5%.
As an extension of [133], El Hariri et al. [134] introduce an IDS based on neural networks with temporal dependencies to identify and mitigate false IEC 61850 sampled measured values, preventing and deterring false data injection attacks. A customizable malware generation script is presented to sniff and manipulate SMV messages and produce false positive predictions used as negative training instances to harden the intrusion detection neural network, enabling it to learn and adapt to smart attacks. The proposed malware generation script can serve as a tool to fine-tune event trigger thresholds and benchmark other IDS algorithms. Experimental results demonstrate the script's ability to attack the neural network [133], with a success rate of 13.75%.
Presekal et al. [125] presented a hybrid deep learning model combining Graph Convolutional Long Short-Term Memory (GC-LSTM) and a deep convolutional network for time series classification-based anomaly detection. Their approach, named CyResGrid, achieved an accuracy of 96.45%, an F1 score of 65.03%, and a G mean of 17.16%, with a false positive rate of 0.13%, for combined attack scenarios. For stealthy cyber attack scenarios, an F1 score of 2.32% and a G mean score of 3.08% were attained.

TABLE XIII COMPARISON OF INTRUSION DETECTION SYSTEMS' PROPERTIES FOR WAN, TELECONTROL OR HORIZONTAL COMMUNICATIONS DOMAINS
Other approaches: Yang et al. [137] propose an IDS for IEC 61850-based smart substations, which adopts a comprehensive and effective multidimensional defense-in-depth approach consisting of access control detection, protocol whitelisting detection, model-based detection, and multiparameter-based detection mechanisms that combine physical knowledge, protocol specifications, and logical behaviors. Experimental results, obtained on a realistic testbed along with data from a real SS, demonstrate the ability of the proposed IDS to consistently detect various attacks in under three milliseconds, including malformed packets, DoS, ARP spoofing, and MiTM attacks.

B. Cyber-Detection for Telecontrol, Horizontal or WAMPAC Communications
Smart substations are integral parts of smart electrical power grids, exchanging data with control centers and even other substations. In this scope, communications between SSs and other smart electrical power grid components rely on protocols such as IEC 60870-5-104, IEC 61850-90-5 or IEEE C37.118, which may be exploited by cyber attackers. Table XIII lists recent methods which will be further reviewed in this section.
Panthi [140] proposes an IDS, based on six machine learning techniques, to classify the causes of disturbance with a particular focus on DDoS attacks. The proposed system can differentiate between natural, malicious, and non-malicious disturbances and was evaluated on four datasets created by the Mississippi State University. Experimental results show a maximum disturbance classification accuracy of over 80%, obtained from a Random Forest + Adaboost classifier, for two, three, and multi-class dataset instances.
Egger et al. [145] test and compare four intrusion detection methods on the IEC 60870-5-104 asynchronous communication protocol: supervised, semi-supervised, unsupervised, and signature-based detection. Evaluation was based on a dataset captured from an Austrian Power Grid AG (APG) test environment consisting of normal operation and attack traffic corresponding to four attack categories.
Experimental results reveal that a signature-based anomaly detection method performs well on normal traffic as well as on some attacks, but fails to classify specific attacks, revealing an overall detection accuracy of 98.42% and class accuracies of 100%, 100%, 99.4%, 29.2%, and 6.8% for normal traffic, Nmap attacks, Syn Flood attacks, Nessus attacks and Fuzzy attacks, respectively.
A supervised intrusion detection method was evaluated on a medium size tree classifier, selected from a set of classifiers including tree classifiers of various sizes, Linear Discriminant Analysis (LDA), Support Vector Machine (SVM) with linear and Gaussian kernels, K-Nearest Neighbor (KNN) classifier, and tree ensembles with bagging and boosting. Experimental results revealed a perfect classification of normal traffic, as well as a small percentage of misclassified attacks, resulting in an overall precision of 100% and a recall of 99.62%. A classwise detection performance analysis further revealed class accuracies of 100%, 98.79%, 99.13%, 77.26%, and 97.65% for normal traffic, Nmap attacks, Syn Flood attacks, Nessus attacks and Fuzzy attacks, respectively.
A semi-supervised anomaly detection method adopted a one-class SVM with a Gaussian kernel to model normal traffic. This approach achieved an overall detection accuracy of 99.92% and class accuracies of 99.94% and 99.44% for normal and attack traffic, respectively. The unsupervised method proved more challenging, with an overall detection accuracy of 95.30% and class accuracies of 97.41% and 34.52% for normal and attack traffic, respectively. Among the evaluated methods, signature-based and supervised methods showed the lowest number of false positives, with unsupervised methods being the worst performers. Regarding false negatives, supervised and unsupervised methods respectively provided the lowest and highest results.
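An overall accuracy that stays high while the attack-class accuracy collapses is what one would expect from a test set dominated by normal traffic. The Python sketch below makes the arithmetic explicit using a two-class confusion matrix; the counts are hypothetical and are not taken from [145].

```python
def class_accuracies(confusion):
    """Per-class accuracy: correctly classified samples over class size.

    `confusion[true_label][predicted_label]` holds sample counts.
    """
    return {label: row[label] / sum(row.values())
            for label, row in confusion.items()}


def overall_accuracy(confusion):
    """Fraction of all samples that were classified correctly."""
    correct = sum(row[label] for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


# Hypothetical counts: 9700 normal samples and only 300 attack samples.
confusion = {
    "normal": {"normal": 9500, "attack": 200},
    "attack": {"normal": 195, "attack": 105},
}

per_class = class_accuracies(confusion)
overall = overall_accuracy(confusion)
```

With these counts the overall accuracy is 96.05% although only 35% of the attack samples are detected, which is why per-class figures (or recall, F1 and G mean) are more informative than overall accuracy on imbalanced intrusion detection datasets.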
Kreimel et al. [146] present two anomaly detection methods, based on (i) machine learning and (ii) formal methods, to secure SCADA systems on substation domains based on IEC 60870-5 and IEC 61850 standards. The machine learning method relies on a three-layer perceptron architecture to accurately determine the traffic nature, based on network-based features selected according to the highest information gain. The formal method, based on networks of timed automata, is built directly from raw features and temporal logic, not requiring an extended amount of data to learn high-performance models. Experimental results, obtained on a real testbed, under normal operation as well as under MiTM filter, MiTM increment, MiTM drop, and DoS attacks, reveal a machine learning method average classification accuracy of 91%, which contrasts with an average classification accuracy of 99% obtained from formal classification methods, demonstrating the ability of formal methods to outperform machine learning counterparts.
Basumallik et al. [147] propose a CNN-based anomaly detector to identify False Data Injection Attacks (FDIA) against PMU-based state estimators. The proposed classifier extracts features from time series data, collected from PMU data packets aggregated in PDCs, and exploits the strong correlations of spatio-temporal changes in voltage and current measurements. Data injection attacks target a subset of sampled measurements, thus changing inherent correlations. The proposed CNN classifier is compared with RNN and LSTM deep learning algorithms, as well as with other traditional classifiers such as SVM and popular ensemble techniques such as Bagging, Boosting, and RUS-Boosting Trees. Experimental results confirm a 98.67% accuracy demonstrated by the proposed CNN-based classifier, contrasting with 91.18% and 83.18% accuracies obtained by RNN and LSTM deep learning classifiers, respectively. Traditional machine learning classifiers revealed lower performance than the CNN, with Boosted Trees obtaining a 93.78% accuracy when combined with correlation, 94.02% when combined with statistical parameters of wavelets, and 82.37% when combined with statistical parameters of PCA. Bagged Trees resulted in 93.56% accuracy when combined with mean of wavelet coefficients, whereas RUS-Boosted Trees resulted in 92.66% accuracy when combined with correlations.
Kreimel and Tavolato [148] present an anomaly detection system for electrical substations. The system captures real-time network packets of substation devices and extracts relevant features, selected based on information gain. The features selected include the Round Trip Time (RTT), packet length, packets per second, TCP window size, measurement values, and communication paths. Several classifiers were tested, such as KNN, SVM, NB and NN, with a Multilayer Perceptron (MLP) neural network-based classifier being adopted because it achieved the highest classification accuracy. Several network attacks were executed under supervision for evaluation purposes, including MiTM filter attack, MiTM increment attack, MiTM drop attack, and DoS attack. The neural network was trained with both normal and attack traffic, demonstrating an overall classification accuracy of 91.5%. However, the proposed system partially misclassified increment attacks, revealing 36% recall and 76% precision rates. The system was also tested on unsupervised network traffic, revealing, for some classes, confidences over 98%. However, the system failed to achieve high accuracy with incremental attacks, due to their stealthy nature.
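Both works by Kreimel et al. rank features by information gain. As a minimal sketch of that criterion, assuming discrete (or previously binned) feature values and hypothetical function names, the gain of a feature is the reduction in label entropy obtained by splitting the traffic on that feature. Continuous features such as RTT would first need discretization, which is an assumption not detailed in the surveyed text.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """IG = H(labels) - sum over values v of p(v) * H(labels | feature == v)."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group)
                      for group in groups.values())
    return entropy(labels) - conditional


# Toy data (hypothetical): a binned packet-rate feature separates the classes,
# whereas a constant feature carries no information.
labels = ["normal", "normal", "attack", "attack"]
gain_rate = information_gain(["low", "low", "high", "high"], labels)
gain_constant = information_gain(["x", "x", "x", "x"], labels)
```

Features would then be kept in decreasing order of gain, up to whatever cut-off the designer chooses.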
Data produced by PMUs distributed across SSs provides a high-quality real-time perspective of the electrical grid operation, enabling area control error monitoring, event detection, early fault detection, and proper demand-response management, being also important to provide cyber situational awareness. Jiang et al. [149] present a comprehensive study on the detection of malicious spoofing attacks in PMU data streams, based on SVM and Artificial Neural Network (ANN) techniques. Features, selected based on Pearson Correlation Coefficient (PCC), include three strongly correlated features (negative sequence voltage, negative sequence phase angle, and frequency), and two moderately correlated features (negative sequence angle and zero sequence phase angle). Experimental results, based on PMU datasets obtained from Bonneville Power Administration (BPA) transmission network and Inter-university (OSU) distribution network, reveal spoofing attack detection accuracies of 98.21% and 94.07% for a two-class SVM classifier using BPA and OSU datasets, respectively, and detection accuracies of 98.47% and 94.24% for an ANN classifier using BPA and OSU datasets, respectively. The results indicate that both spoofing detection techniques are effective on transmission and distribution network domains. Finally, both methods offer a better False Discovery Rate (FDR) on the BPA dataset due to a lower transmission noise, when compared to distribution noise.
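PCC-based feature selection, as used above, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the feature names, the toy values and the numeric encoding of the class label (0 for genuine, 1 for spoofed) are hypothetical, and the thresholds separating "strong" from "moderate" correlation are left to the reader.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # Assumption: treat a constant sequence as uncorrelated.
    return cov / (norm_x * norm_y)


def rank_features(features: Dict[str, Sequence[float]],
                  target: Sequence[float]) -> List[Tuple[str, float]]:
    """Sort features by decreasing absolute correlation with the target."""
    scored = [(name, pearson(column, target)) for name, column in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Toy data (hypothetical): 'frequency' tracks the label, 'noise' does not.
ranking = rank_features(
    {"noise": [1.0, -1.0, 1.0, -1.0], "frequency": [1.0, 2.0, 3.0, 4.0]},
    target=[0.0, 0.0, 1.0, 1.0],
)
```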
Phasor measurement units generate large data volumes which require specific large-scale data processing engines. Vimalkumar and Radhika [150] present a Big Data framework, implemented in Apache Spark, an open-source unified analytics engine for large-scale data processing. The framework consists of various machine learning classification techniques such as Deep Neural Network (DNN), SVM, DT, RF and NB, to detect intrusions in synchrophasor data. Supervised classifiers are trained on a synchrophasor dataset consisting of normal, attack, and disturbance operation classes, collected from a two-line, three-phase power system. Feature selection and dimensionality reduction algorithms are adopted to limit the set of features presented to the various machine learning techniques, discarding features with limited contribution to the detection performance. Experimental results reveal a superior accuracy of 79.86% obtained by a DNN from raw features, followed by 75.92%, 73.27%, 71.97% and 71.67% accuracies achieved by SVM, NB, DT, and RF, respectively, from raw features. The accuracy is generally penalized by a feature selection strategy, except for NB and RF classifiers, which improve the detection accuracy to 79.21% and 73.27%, respectively. Similarly, the accuracy is generally penalized by a PCA dimensionality reduction strategy, except for NB, which improves the detection accuracy to 77.87%. Nevertheless, a feature selection strategy reduces the prediction time, which is further reduced by a PCA dimensionality reduction strategy.

C. IEEE 1588 Clock Synchronization Protection
Clock synchronisation is of utmost importance for PMU and synchrophasor applications because its integrity is vital to enable precise timestamping and phasor correlation; it is thus also an important security vector which can be exploited to disrupt SG operation. Table XIV lists several security proposals for PTP security enhancement, covering detection as well as mitigation capabilities, which are discussed next.
Regarding PTP timing integrity, Alghamdi and Schukat [151] investigated a series of attack strategies against PTP, outlined in [122], focusing on the two most commonly used PTP daemons, PTP4l and PTPd, and highlighting the necessity of advanced attack detection methods. Reference [123] builds on their previous work and proposes a detection system that is aligned with IEEE 1588-2019 Annex P recommendations. The proposed system involves analyzing data gathered from all network slave clocks during the synchronization process and then monitoring for any anomalies caused by an attack. This approach can function on a device that lacks an exact time reference, such as an unsynchronized stand-alone device, or a dual clock device that contains both a PTP slave clock and an accurate unsynchronized local clock, such as a non PTP-synchronized device with an oven-controlled crystal oscillator, offering different levels of attack detection capabilities. Moussa et al. [152] also proposed a protocol extension to support a cyber attack detection mechanism capable of dealing with delay attacks as well as attacks against grandmaster, transparent and slave clocks. Moradi and Jahangir [153] proposed an attack detection algorithm that is able to detect delay attacks on Sync messages by comparing network clocks, in addition to attacks against transparent clocks and simultaneous attacks on the network.
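To illustrate why delay attacks threaten timing integrity, the sketch below applies the standard PTP delay request-response arithmetic, which assumes a symmetric path: a slave derives its offset from the four timestamps t1 (Sync sent), t2 (Sync received), t3 (Delay_Req sent) and t4 (Delay_Req received). The function names, the nanosecond values and the tolerance check are illustrative assumptions and do not reproduce the detection systems of [123], [152] or [153].

```python
from typing import Tuple


def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int) -> Tuple[float, float]:
    """Return (offset_from_master, mean_path_delay) in the timestamp unit.

    t1: master sends Sync, t2: slave receives Sync,
    t3: slave sends Delay_Req, t4: master receives Delay_Req.
    The computation assumes equal delay in both directions.
    """
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    offset_from_master = (t2 - t1) - mean_path_delay
    return offset_from_master, mean_path_delay


def delay_shift_suspicious(baseline_delay: float, measured_delay: float,
                           tolerance: float) -> bool:
    """Naive check (assumption): flag a path delay far from a trusted baseline."""
    return abs(measured_delay - baseline_delay) > tolerance


# Perfectly synchronized slave, 1000 ns path delay in each direction.
clean = ptp_offset_and_delay(t1=0, t2=1000, t3=5000, t4=6000)
# An attacker holds the Sync message for an extra 400 ns (asymmetric delay).
attacked = ptp_offset_and_delay(t1=0, t2=1400, t3=5000, t4=6000)
```

In the attacked case the slave computes an offset of 200 ns, half the injected delay, and would wrongly correct a clock that was already aligned; the measured path delay rises from 1000 ns to 1200 ns, which is the kind of shift a monitoring function could look for.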

D. Critical Analysis
The numerous works surveyed reveal a wide variety of promising intrusion detection methods, as well as the efforts to propose realistic solutions targeting resource-constrained hardware and compliant with the strict timing requirements imposed by IEC 61850.
When it comes to anomaly detection methods, four major families are identified: signature, sequence, rule and machine-learning based. However, it has been observed that machine-learning based approaches are gaining traction due to superior performance. Nevertheless, the different evaluation metrics adopted by each method, along with divergent evaluation datasets, prevent a fair and reliable performance comparison of the presented solutions.
Some works leverage the laws of physics governing electrical networks to predict measurements and detect maliciously altered sample value messages [100], [126], [136], [138]. The combination of network, electrical network and equipment operation data offers superior detection accuracy. However, these methods are effective mostly in the late stages of the cyber kill-chain, when an attack is producing tangible effects. Moreover, studies often disregard classification/detection latency, which is crucial for SS scenarios.

IX. DATASETS AND TESTBEDS
Datasets are of paramount importance to evaluate IDSs and train machine learning intrusion detection classifiers. The performance of signature and anomaly-based IDS solutions is strongly correlated with the quality and quantity of dataset instances. However, there is a chronic scarcity of real datasets collected from SSs and made publicly available to the research community, mostly due to the proprietary nature and potential sensitivity of SS and SG operations and data. As a result, researchers often collect datasets from custom testbeds or computer algorithms.

A. SS Network Communications Datasets
The access restrictions to live SS operation data resulted in different strategies to obtain representative datasets, often classified as follows.
• Real. Collected from live SS operations. The sensitive nature of the data produced by live SSs, as well as the risks of disrupting the normal operation of live electrical grids, makes real datasets difficult to obtain.
• Testbeds. Collected from limited SS implementations to mimic SS operation and produce realistic data. The size and complexity of testbeds is typically the main limiting factor that prevents researchers from obtaining [...]
[...] [154] propose a four-step methodology to generate anomaly detection datasets in ICSs, consisting of (i) selection of attacks, (ii) deployment of attacks, (iii) traffic capture, and (iv) computation of features. The methodology was adopted to generate an electric traction substation dataset used to successfully train various machine learning anomaly detection models.
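As a rough illustration of step (iv), computation of features from captured traffic, the sketch below aggregates packets into fixed time windows and derives two simple features per window. It is not the feature set of [154]: the input format (timestamp in seconds, packet length in bytes), the window size and the feature names are assumptions chosen for brevity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def window_features(packets: Iterable[Tuple[float, int]],
                    window_s: float = 1.0) -> Dict[int, Dict[str, float]]:
    """Group (timestamp, length) records into windows and compute features."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    for timestamp, length in packets:
        buckets[int(timestamp // window_s)].append(length)
    return {
        window: {
            "packets_per_second": len(lengths) / window_s,
            "mean_length": sum(lengths) / len(lengths),
        }
        for window, lengths in sorted(buckets.items())
    }


# Toy capture (hypothetical values): two packets in the first second, one in the next.
features = window_features([(0.1, 100), (0.5, 200), (1.2, 60)])
```

Each window would then be labeled according to whether an attack from steps (i) and (ii) was active during it, yielding the instances of the dataset.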
Table XV lists the few public datasets found in the electric grid and smart substation domains. The small number of datasets found suggests a reluctance to share real datasets and reflects the effort of building testbeds.

B. Testbeds
Access to live electrical sector operation data is restricted due to its sensitive and private nature. Testbeds offer an alternative solution to capture operational data from limited functional implementations which model real SSs. Table XVI lists recent testbeds found in the literature, including energy sector testbeds listed in [158] and [159]. The large number of testbeds is due to a chronic scarcity of public datasets and the variety of test case scenarios used for research purposes.
The accurate evaluation of large-scale cybersecurity models is largely constrained by testbeds' physical resources. Testbeds are often limited to small-scale Hardware-In-the-Loop (HIL) designs and fully simulated test environments. An alternative solution is proposed by Ravikumar et al. [186], based on a careful design and modeling of IEC 61850 logical nodes in physical relays to simulate large-scale SG models in a HIL simulation setup. The proposed design emulates multiple circuit breakers in physical relays, extending their capacity up to 32 emulated circuit breakers with trip/close functionality.
Similarly, Rosa et al. [160] propose a complete IADS framework and rely on a large-scale HIL HEDva testbed, built by the Israel Electric Company, to assess vulnerabilities of different SCADA communication protocols and equipment, without the risk of damaging equipment or the electrical grid. The testbed includes power generation, transmission, distribution, and consumption functions, as well as SCADA servers, offering a hybrid combination of real SCADA components with emulated processes.

X. CYBERSECURITY STANDARDS OVERVIEW
The security regulatory frameworks for protection of SSs and SGs rely on a variety of security standards and guidelines to address organizational, process, and technical security risks, combining individual specific security requirements in a holistic security approach to deliver adequate protection.
This section presents the most relevant cybersecurity standards targeting SSs, summarized in Figure 21, based on IT and OT cross-domain cybersecurity requirements, OT cross-domain process-related industrial communication networks, and OT domain-specific standards to secure SS.

A. ISO/IEC Std 27001
The ISO/IEC 27001:2013 [187] standard details an Information Security Management System (ISMS) life-cycle, defining the requirements to establish, implement, maintain and improve organizational security, protecting information confidentiality, integrity, and availability in a systematic manner. It also defines security control requirements to assess and treat information security risks according to the organization's needs, following a set of intentionally generic requirements suiting different organization types, sizes, and operation models. The ISMS describes methods to operate the infrastructure and secure its information according to their [...] The ISMS consists of a set of rules and guidelines for several contexts, namely: organizational, focused on identifying stakeholder requirements and the ISMS scope; leadership, for role/responsibility definition; planning, to identify opportunities and assess risks to develop processes; support, to identify the resources needed to establish, maintain and improve the ISMS; operation, focused on implementing control security requirements, plans and assessment procedures; performance evaluation; and, finally, to implement corrective measures for continuous improvement.
In the specific scope of cybersecurity, for instance, many organizations have developed ISMS in accordance with ISO/IEC 27001/27002. Energy-specific infrastructures are also covered by ISO/IEC 27019 [189], which evolved from ISO/IEC 27002, being focused on the infosecurity operational aspects of the energy utility industry.

B. ISA/IEC Std 62443
The ISA/IEC 62443 standard (formerly ISA-99) defines the procedures to implement secure IT industrial communication networks for IACSs, being designated as a horizontal [...] The first category (IEC 62443-1-1) introduces terminologies, concepts, and models required for a complete understanding of the standard. The second category (IEC 62443-2-1, IEC 62443-2-3, and IEC 62443-2-4) describes security policies and procedures applicable to asset owners, addressing various aspects of creating and maintaining effective IACS security strategies.
The third category (IEC 62443-3-1 and IEC 62443-3-3) describes system-level security design requirements and guidelines for system integrators, allowing a secure integration of control systems. Finally, the fourth category (IEC 62443-4-1 and IEC 62443-4-2) describes specific product development and component-level security requirements and guidelines for product vendors, to assist integrators and asset owners in the procurement of secure products. The latter also defines a series of functional requirements encompassing logging/audit aspects related to storage, protection or accessibility.

C. IEC Std 62351
The IEC 62351 series of standards was developed to add cybersecurity functions to a group of communication protocols including the IEC 60870-5 series (also covering IEEE 1815, DNP3, as a derivative standard), the IEC 60870-6 series (ICCP), the IEC 61850 series (including client-server, GOOSE, and SMV), as well as IEC 61970 and IEC 61968 (Common Information Model, CIM). It is divided as shown in Table XVIII.
The complex relationship between IEC TC57 communications and IEC 62351 security standards, represented in Figure 22, involves many aspects such as cryptography, key management or RBAC, among others. It is expected that further developments of this framework address the improved protection of protocols such as GOOSE or SV, whose strict timing requirements (multicast protection relay messages are sent every 3 milliseconds between controllers) rule out the possibility of using computationally intensive security mechanisms such as encryption. While IEC 62351-6 proposes signing packets using an RSA-Probabilistic Signature Scheme (RSA-PSS) to guarantee the integrity and authentication of GOOSE messages, a performance evaluation of various signature schemes conducted by Farooq et al. [199] showed that even the best performer (RSASSA-PKCS1-v1_5) does not meet the strict timing requirements of GOOSE messages, creating a need for high-performance security schemes.

D. IEEE Std C37.240
IEEE C37.240 [201] defines a series of cybersecurity requirements for substation automation, protection, and control systems, also proposing several technical solutions to protect electric substation communication systems, albeit without providing detailed specifications. The standard imposes periodic security testing methods to verify the effectiveness of applied security controls, including penetration testing, vulnerability scanning, physical security audits, and reviews of security policies, procedures, and firewall rules.

E. IEEE Std 1686
IEEE 1686 [202] is an IEEE standard for intelligent electronic device cybersecurity capabilities. It defines IED features and functions to protect the electrical infrastructure, also defining measures to ensure secure access, operation, configuration, firmware revision, and data retrieval.

F. Compliance vs. Effectiveness: A Faux Trade-Off
The ongoing debate about whether regulatory and standards compliance should be privileged over more immediate and tangible measures is rooted in the somewhat contradictory argument that compliance diverts attention from pressing matters. The problem seems to lie in the value perception that standards compliance may bring to the cybersecurity domain, especially when their implementation is regarded as more of a bureaucratic procedure and less of an effective process.
Fig. 22. Mapping of IEC 62351 cybersecurity standards to IEC TC57 power system communication standards [200].
However, it must not be forgotten that standards and regulations are built on past experience of incidents and successful defensive actions, providing a structured approach to embed these lessons in daily operations. Compliance guarantees the benefit of such knowledge to be shared among operators. For instance, standards such as ISO/IEC 27001, IEC 62443, and IEC 62351 can be seamlessly integrated to provide a solid security foundation, extending cybersecurity functions to multiple information system domains.
Ultimately, solutions, tools, standards and procedures are meaningless without strategy and guidance. Security should not be regarded as a band-aid patch, but rather as something intertwined in the infrastructure, from its early development stages, by involving security teams in SS and SG design. Thus, it is up to the security management team to steer the design, development, deployment and assessment of security mechanisms, working at the intersection point between the organizational, regulatory, technical and human/team domains, and maximizing the benefits of these synergies.

XI. WRAP UP: A SS DEFENSE-IN-DEPTH STRATEGY
This section covers the key aspects to be taken into account to develop a robust SS security strategy, providing an overview of resources, technologies and methods, as well as discussing their suitability for the SS domain. Due to the cyber-physical nature of the SS role, this strategy encompasses the physical, cyber and communications protection domains.

A. SS Domain-Specific Security Requirements
Generally speaking, SS and SG share some of the problems at the root of many OT security incidents, such as the widespread use of unprotected communications. But SG and SS add extra constraints to the mix, in terms of asset/service availability, communication latency, architectures, and contingency management strategies [203], [204]. For instance, real-time requirements are crucial in SS, which are sensitive to timing disturbances that can cause delays or erroneous behaviors, impacting physical processes (this is the case for the strict SS message exchange performance requirements, which limit the use of inline Network IDS). Another example is given by the availability and service continuity requirements, which demand that configuration changes or patches undergo a side effect analysis, as well as planned shutdown procedures.
The aforementioned factors have a decisive impact on the selection and deployment of security mechanisms. As an example, and despite the apparent natural advantage offered by Intrusion Prevention Systems (IPS), they are seldom deployed in smart substations due to the risk of damaging effects caused by automatic reactions to false positive detection events. Instead, IDS are often preferred to detect and report intrusions, which are further analyzed by humans assisted by cybersecurity tools and guidelines. Moreover, deploying inline or automatic reaction mechanisms should be avoided altogether, as they can backfire, potentially (or even deliberately, if abused by an attacker) degrading or isolating network sections.

B. Defense Layers
Defense-in-Depth (DiD) is a multilayered cybersecurity approach that combines multiple methods and mechanisms to ensure data protection on every level. DiD encompasses security practices tailored to each infrastructure layer, establishing multiple defense lines to offer additional protection in case of a single line failure. Thus, it contributes to reducing the attack surface and hampering threat progression, especially if complemented by adequate personnel training to ensure preparedness in face of adverse events. Dazahra et al. [205] proposed a DiD multi-layer framework based on best practices to improve substation cybersecurity which, nevertheless, is insufficient for modern SSs. Figure 23 offers an updated take on the SS DiD strategy concept, covering the device/endpoint, network/protocol and physical process/infrastructure domains, organized along five security layers, detailed next.
Access control, filtering and segmentation/segregation: It is the role of the security policies to define roles, responsibilities and trust domains for people, facilities and infrastructures, for each SS security domain, as outlined by ISO/IEC 27002/19 and also NIST SP800-82 [206].
Network trust zones should be created, also enforcing network segmentation to create containment areas separated by firewalls (for instance: one zone for a DMZ mediating the electronic security perimeter of the SS; another for engineering, logging, SCADA workstations and process bus; and one zone for the field domain), configured to filter traffic, limiting access to strictly necessary services (as per IEC 62443, ISO/IEC 27002/19 and NIST SP800-82). Data diodes [207] may be used to create unidirectional communication links between SS and control centres, also applicable for WAMPAC use cases. Regarding firewalls, IEC 62351-90-2 covers the application of Deep Packet Inspection for secure communications, with a state-of-the-art analysis.
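The filtering principle described above, where only strictly necessary services may cross a zone boundary, amounts to a default-deny allow-list. The sketch below is a minimal illustration of that idea; the zone names loosely follow the three-zone example in the text, while the service labels and the specific flows are hypothetical and would depend on each site's design.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical zones and services; not a recommended or standard-mandated policy.
ALLOWED_FLOWS: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("control_centre", "dmz"): frozenset({"telecontrol"}),
    ("dmz", "station"): frozenset({"telecontrol", "logging"}),
    ("station", "field"): frozenset({"mms"}),
}


def flow_permitted(src_zone: str, dst_zone: str, service: str) -> bool:
    """Default-deny: a flow passes only if explicitly listed for that zone pair."""
    return service in ALLOWED_FLOWS.get((src_zone, dst_zone), frozenset())
```

Under this toy policy, a request from the control centre directly to the field zone is rejected because no entry exists for that zone pair, which is the containment effect segmentation aims for.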
For this purpose, deployment of Physical and Endpoint RBAC is key to restrict user actions based on their roles, with frequent credential rotation for premises, services and equipment. The principle of least privilege must drive the role to permission mapping process in access to devices, HMIs or engineering workstations. This is in line with: ISO/IEC 27001, which requires the implementation of a "need to know" principle with user access controls and authorization procedures outlined by ISO/IEC 27002/19 (as well as NIST SP800-82), to prevent unauthorized access to information or systems; IEC 62443-3-3, which requires a system-level technical solution for user authentication and authorization; IEC 62351-8, which describes an RBAC technical implementation to achieve interoperability; and IEEE C37.240. Physical infrastructure perimeters for SS bay or process zones must have physical barriers and controls, and must only be accessible after proper security clearance confirmation, as envisioned by IEEE C37.240 and ISO/IEC 27002/19.
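The role-to-permission mapping driven by least privilege can be sketched as follows. The role and permission names are hypothetical and are not claimed to match the mapping defined by IEC 62351-8; the point is only that each role receives the minimum set of rights it needs and that unknown roles receive none.

```python
from typing import Dict, FrozenSet

# Hypothetical role-to-permission mapping for illustration only.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"read_measurements"}),
    "operator": frozenset({"read_measurements", "operate_breaker"}),
    "engineer": frozenset({"read_measurements", "change_settings"}),
}


def is_allowed(role: str, permission: str) -> bool:
    """Grant an action only if the role explicitly holds the permission."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())
```

In this sketch an operator may operate a breaker but cannot change relay settings, and an engineer may change settings but cannot operate a breaker, so compromising a single account exposes only part of the device's functions.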
Hardening, updates, change management: Proper security management must deploy preventive measures, hardening infrastructures, devices and services, in order to deter or hamper potential attackers at an early stage.
Endpoints and device configurations must be hardened, often implying deactivation of unnecessary services and limiting installed software on engineering and SCADA workstations, with authorized applications and services being limited to those strictly required, as suggested by ISO/IEC 27002/19. An updating and patching policy for embedded devices and workstations must be implemented, aligned with a change management strategy designed to minimize and mitigate the implied risks (as advised by NIST SP800-82 and ISO/IEC 27002/19), in articulation with ISO/IEC 27002/19 lifecycle management policies. For embedded devices, such as IEDs, adoption of signed firmware and user-defined payloads, protected bootloaders (supported by a hardware root-of-trust) and incorporation of glitch-proof and recovery mechanisms are key for proper hardening; this is covered by standards such as IEEE Std 1686.
For Networks and protocols, firewalls (OS-level and appliances) and network equipment must be kept updated. Strict use of encrypted protocols for operation, telecontrol and maintenance must be enforced (as per IEC 62351-3 to -9, with corresponding conformance tests specified by the IEC 62351-100 technical specification series), except for the cases where it might not be possible or adequate. For instance, the use of message authentication codes is considered an adequate tradeoff to ensure integrity for GOOSE messages. In fact, IEC 62351-6 clearly states that "for applications [...] requiring 3 ms response times, multicast configurations and low CPU overhead, encryption is not recommended" [208]. In the unlikely case that IEC 60870-5-101 or DNP3 serial protocols are used, this can be solved by using gateways to provide conversion to TCP/IP variants, thus allowing the use of IEC 62351-3 profiles for TLS usage, or eventually adopting bump-in-the-wire encryption and authentication mechanisms.
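The integrity-without-encryption tradeoff mentioned for GOOSE can be illustrated with a keyed message authentication code. The sketch below uses HMAC-SHA256 from the Python standard library over an opaque payload and assumes a pre-shared key; it does not reproduce the message format, algorithm choices or key management defined by the IEC 62351 series, and the function names are hypothetical.

```python
import hashlib
import hmac


def tag_message(key: bytes, payload: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag; the payload itself stays in clear text."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_message(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag_message(key, payload), tag)


# Hypothetical key and payload, for illustration only.
key = b"\x01" * 32
payload = b"example-goose-like-payload stNum=7 sqNum=0 trip=FALSE"
tag = tag_message(key, payload)
```

A receiver holding the same key accepts the original payload and rejects any modified copy, while the sender avoids the cost of encrypting or asymmetrically signing each time-critical message.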
Physical infrastructure perimeters for SS bay or process zones must have physical barriers and tamper-resistant access restriction and control mechanisms.
Surveillance, assessment, monitoring & auditing: The evolving threat landscape makes it necessary to continuously assess the threat surface and introduce corrective actions wherever necessary, also monitoring equipment, network and infrastructure status for increased visibility and awareness, as required by NIST SP800-82 and ISO/IEC 27019.
For Endpoints and devices, Host IDS or anti-viruses can provide device-level monitoring; however, it should be noted that the former are often preferred due to the potential system latency overhead introduced by the latter. Service log monitoring must be deployed for SCADA components, via log collectors (as specified by IEEE C37.240 and IEC 62351-14) and network management data models specified in IEC 62351-7.
For Network/protocols, traffic and log monitoring must be used, not only for internal SS communications, but also for remote telemaintenance accesses and WAMPAC links (as per IEC 62351-14 and 62351-90-3, also covered by ISO/IEC 27002/19 and NIST SP800-82). NIDS are key for this requirement. Nevertheless, special care must be taken in their deployment, to avoid introducing undesirable network delays or points of failure. Finally, Physical infrastructure surveillance and monitoring must be introduced, using CCTV and electronic access control systems, as per IEEE C37.240.
For all domains, Security Information and Event Management (SIEM) systems can provide a comprehensive infrastructure view, transcending the scope of specific security perimeters while acting as concentration points for acquired field evidence, from logs to data obtained via IEC 62351-7 compliant data sources, as suggested by NIST SP800-82 and ISO/IEC 27019. The same information can also be fed to policy/regulatory compliance audit mechanisms, in order to detect non-conformities that might constitute evidence of malicious activity or risky behaviour by internal personnel, also contributing to prevent security incidents by means of regular compliance assessment.
Multi-domain vulnerability assessment must be periodically performed using tools and frameworks adequately configured to comply with the specific nature of the SS ecosystem. Such tools must work hand-in-hand with asset and configuration management databases (required by IEEE C37.240), avoiding situations where scanners cannot provide more than a probability of a certain asset to be affected by a vulnerability, due to the impossibility of establishing device identity and configuration. Also, active vulnerability auditing procedures such as pentesting campaigns must be carefully planned and executed to avoid unforeseen effects, as mentioned by NIST SP800-82 and ISO/IEC 27002/19.
Finally, asset and software inventories must also provide support for continuous supply-chain assessment processes mandated by ISO/IEC 27002/19, for Software Bill of Materials (SBOM) and equipment, against policies, incident and vulnerability databases.
Response, mitigation and containment: Procedures and mechanisms for adequate response, mitigation and containment in face of adverse scenarios must be implemented, as mandated by ISO/IEC 27002/19. Also, and despite their more generic nature, NIST SP800-61 [209] and ISO/IEC 27035 [210] provide relevant guidelines for incident management (the latter focused on a Plan-Do-Check-Act-like process), many of which also apply to the SG and SS domains. Team planning and preparedness are key for such purposes, by means of advanced training and drills. On the technological side, Endpoint Detection and Response (EDR) and Security Orchestration, Automation and Response (SOAR) tools might be used, but care must be taken regarding the implementation of automatic response procedures; preference must be given to human-in-the-loop processes, avoiding the risk of an attacker abusing such mechanisms to deliberately isolate devices or network segments (NIST SP800-82 offers cautionary advice regarding this matter).
Other strategies, such as resorting to high or medium-interaction honeypots providing SCADA station-like service footprints or emulating IED device profiles, may be a valuable resource to attract and contain threats, eventually as part of a moving target defense strategy, luring and engaging attackers while providing time for threat profiling and countermeasure deployment.
Disaster and incident recovery: Security management must also encompass disaster recovery plans, defining backup procedures and service restoration tasks to be undertaken in case of a successful attack. ISO/IEC 27002/19 provides complete domain-specific guidance regarding these aspects, with ISO/IEC 27035 and NIST SP800-61 also providing more generic, albeit valuable, orientations. Moreover, best practices for incident handling always encompass a "lessons learned" stage, which is only possible by means of post-mortem trace analysis, allowing experts to reconstruct the trail leading to the root cause and generate valid and useful evidence. As such, monitoring and assessment mechanisms must also be complemented by "black box" repositories enabling digital forensics procedures to identify, collect, examine and analyse data while preserving the integrity of the information and maintaining a strict chain of custody for data. In the course of a forensic investigation, it should be assured that all available digital evidence is not modified without authorisation (also covered by ISO/IEC 27002/19).
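One common way to make unauthorized modification of stored evidence detectable is a hash-chained log, in which every record commits to the hash of its predecessor. The sketch below is a minimal illustration of that technique, not a description of any specific "black box" product or of a procedure mandated by the cited standards; the record layout and function names are assumptions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # Placeholder hash preceding the first record.


def append_record(chain: List[Dict], event: Dict) -> None:
    """Append an event whose hash covers both its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    chain.append({"event": event, "prev": prev_hash, "hash": digest})


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every hash; any edited, removed or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        body = json.dumps(record["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if record["prev"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical events, for illustration only.
evidence: List[Dict] = []
append_record(evidence, {"source": "ied-01", "action": "settings_changed"})
append_record(evidence, {"source": "hmi-02", "action": "login_failed"})
```

Such a chain only detects tampering; preserving a chain of custody additionally requires protecting the latest hash (for instance by storing it on separate, access-controlled media), which is outside the scope of this sketch.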
Security management: Security management, instantiated in the form of the ISMS, defines the rules and procedures that shape the entire security posture of the operator, from the strategic to the tactical aspects, ensuring standards and regulatory compliance while taking care of other aspects such as fostering preparedness and managing technical debt (as is the case for legacy systems or components near their end-of-life). For instance, it is up to security management to define a training roadmap including certification processes and drills to increase team preparedness, also focusing on the definition of appropriate playbooks and rulebooks for incident response, providing precise instructions. Proper and clear role definition, as well as optimization of response procedures, communication channels and the number of key persons involved, is crucial to reduce incident response times [211]. For this purpose, ISO/IEC 27001 offers a solid and comprehensive strategic foundation, with ISO/IEC 27002 and, particularly, ISO/IEC 27019 (which is domain-specific) providing tactical-level guidance, complementing other standardization and regulatory frameworks.
Moreover, a proper security management strategy is key to dealing with threats that cannot be addressed by isolated measures, as is the case for compromised supply chains. Supply chain protection requires the adoption of a multi-domain structured approach encompassing aspects such as: the policy realm (for example, logging and tracking shipments, using locks and tamper-proof seals during shipping, requiring employee background checks, resorting to licensed auditors to certify third parties, or using certified suppliers); the technological DiD framework, as part of the security assessment procedures or controls (such as RBAC for data and systems); and even team management (for instance, as part of training procedures).

C. DiD Development Using the Cyber Kill-Chain
While the previous subsection presented a DiD strategy to protect SS, little was said about which specific threat progression stages each measure is meant to address. The cyber kill chain model helps to close this gap: it acknowledges that threats undergo a development process with specific stages that can be dealt with by resorting to different tools and techniques that, altogether, can provide the building blocks for an encompassing cybersecurity strategy. More than a strict blueprint, the cyber kill chain is a guide that helps to link a layered defense strategy with the specific threat progression stages whose mitigation is envisioned. For this purpose, Table XIX depicts how the envisioned DiD strategy can be specified for SS, in alignment with specific kill chain stages.

TABLE XX ABBREVIATIONS USED IN THIS MANUSCRIPT

XII. OPEN ISSUES AND RELEVANT GAPS
This work surveyed the most recent publications on intrusion detection and mitigation techniques in SS critical infrastructure, revealing multi-domain limitations which are still to be addressed and may inspire future research works:
1) Datasets: Datasets generated from testbeds or by synthetic means often lack complexity and realism, especially when compared with data collected from operational smart substations, potentially hampering research efforts.
2) IDS reliability and comparability: Quite often, the accuracy and precision results reported for proposed cybersecurity techniques, which are deemed adequate for conventional use cases, are unacceptable for critical infrastructure scenarios targeting high availability. This is the case for power grids, where some operators boast 99.9999% availability figures [212] and where the cost of false positives or false negatives can be extremely high. Finally, and to the best of our knowledge, there are no widely accepted IDS performance benchmarks, preventing a fair comparison of different techniques.
3) Coping with real-time requirements: The discussion about the nature of protocols such as GOOSE and SV makes evident the extent to which determinism and real-time requirements are crucial for SS environments, not only limiting the scope of the protection mechanisms that can be deployed, but also revealing a gap regarding the availability of detection techniques capable of meeting tight timing requirements while operating in resource-constrained environments.
4) Insufficient focus on early-stage prevention: Effective detection can only be achieved by focusing on early warning systems acting at the earliest stages of the cyber kill chain. This is only achievable by developing high-accuracy, multi-domain, holistic intrusion detection solutions, which are also important to cope with distributed threats not focused on single components. However, the majority of the existing work on detection for cybersecurity purposes is focused on field protocols used in SS domains, for which automatic reaction mechanisms are not only mostly unfeasible (due to the nature of the power system process, which requires sub-ms response times), but also undesirable, as most operators tend to avoid any sort of automatic reaction mechanism for cybersecurity threats.
5) Complex regulatory and standardisation landscape: The diversity of cybersecurity standards and regulations that cover different aspects of the SS and SG landscape (sometimes even overlapping each other) makes it difficult to enforce compliance in an effective way, prompting a debate on whether a more pragmatic approach should be followed. Equipment/solution providers, standardisation bodies, service operators and even national bodies should work more closely towards simplifying the regulatory frameworks, eliminating redundant and overlapping aspects, while defining clear boundaries for the scope of each standard. This makes even more sense when considering that power grids have become more interconnected over the past years, creating further interdependencies, but also fostering cooperation between all involved actors.
Besides these challenges, there is also the need to acknowledge the medium-term evolution perspective of the SS automation architecture, which may evolve towards service and equipment consolidation. As happens with most radical evolutions, these developments may also affect the deployment and effectiveness of cybersecurity protection mechanisms, creating new challenges to be tackled.
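To make the reliability concern in item 2 more concrete, the following sketch works through two pieces of arithmetic: the yearly downtime budget implied by a 99.9999% availability figure, and the share of IDS alerts that are genuine when attacks are rare (Bayes' rule). The prevalence, detection rate and false positive rate used below are purely hypothetical inputs chosen for illustration; they are not figures from the surveyed literature.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year


def allowed_downtime_seconds(availability: float) -> float:
    """Yearly downtime budget implied by an availability figure."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def alert_precision(prevalence: float, tpr: float, fpr: float) -> float:
    """P(attack | alert), computed with Bayes' rule.

    prevalence: fraction of inspected events that are malicious
    tpr: true positive rate (detection rate) of the IDS
    fpr: false positive rate of the IDS
    """
    true_alerts = prevalence * tpr
    false_alerts = (1.0 - prevalence) * fpr
    return true_alerts / (true_alerts + false_alerts)


# 99.9999% availability leaves about 31.5 seconds of downtime per year.
downtime = allowed_downtime_seconds(0.999999)

# Hypothetical IDS: 99% detection rate, 1% false positive rate,
# with 1 malicious event per 10,000 inspected events.
precision = alert_precision(prevalence=1e-4, tpr=0.99, fpr=0.01)
```

Under these assumed inputs, fewer than 1 in 100 alerts would correspond to a real attack, which illustrates why accuracy figures that look adequate in conventional settings can be insufficient for high-availability infrastructures.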

XIII. CONCLUSION AND FUTURE RESEARCH DIRECTIONS
The need to cope with the challenges posed by a shift towards a bidirectional decentralized electrical power generation model has deepened the digitalization of the SG infrastructure, to which SS are not immune. However, this has also increased the exposure of the infrastructure to cyberattacks, with potentially catastrophic consequences. Considering the current situation, this paper provides an overview of the SS cybersecurity landscape, focused on four domains: attack surface and threat model characterisation; intrusion detection strategies, testbeds and datasets; standardization frameworks; and design of a DiD strategy.
Overall, the analysis of the state-of-the-art regarding SS intrusion detection and defense mechanisms exposed various open issues which can inspire and guide future research works, also revealing a lack of benchmarks widely adopted by the research community, which prevents a fair evaluation and comparison of intrusion detection systems. Also, it is not clear to what extent many intrusion detection mechanisms can really cope with the specific requirements of the SS domain, especially in terms of real-time detection capabilities, infrastructure coverage or implementation feasibility. Future research directions should (i) standardize the evaluation criteria and metrics for detection systems, (ii) create widely accepted datasets, (iii) publish intrusion detection benchmarks to promote a fair comparison among different cyber detection approaches, (iv) explore alternative state-of-the-art detection techniques, considering the strict timing requirements of SS communications, as well as diversified data sources, to improve the detection performance, and (v) investigate distributed intrusion detection systems to cope with sophisticated attack strategies.
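As an illustration of direction (i), the sketch below computes a fixed set of standard metrics from a confusion matrix, so that different detectors could be reported on the same basis. The `Confusion` type, the `metrics` function and the example counts are assumptions made for this sketch; they do not represent an agreed benchmark or results from any surveyed work.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    """Confusion matrix counts for a binary intrusion detector."""
    tp: int  # attacks correctly flagged
    fp: int  # benign events wrongly flagged
    tn: int  # benign events correctly passed
    fn: int  # attacks missed


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def metrics(c: Confusion) -> Dict[str, float]:
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
    }


# Illustrative counts only: 100 attacks among 10,000 events.
report = metrics(Confusion(tp=90, fp=10, tn=9890, fn=10))
```

Reporting the false positive rate and recall alongside accuracy matters for imbalanced traffic: in this example the accuracy is 0.998 even though one in ten attacks is missed.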

ABBREVIATIONS
The most relevant abbreviations used in this manuscript and corresponding definitions are listed in Table XX.

ELECTRICAL power grids are at the very heart of Critical Infrastructures (CI), affecting all sixteen

Manuscript received 16 April 2023; revised 8 July 2023; accepted 12 August 2023. Date of publication 15 August 2023; date of current version 22 November 2023. This work was supported in part by FEDER, in the context of the Competitiveness and Internationalization Operational Program (COMPETE 2020) of the Portugal 2020 framework, in the scope of Project Smart5Grid under Grant POCI-01-0247-FEDER-047226; in part by the "Agenda Mobilizadora Sines Nexus" Project under Grant 7113 through the Recovery and Resilience Plan (PRR) and the European Funds Next Generation EU, Component 5-Capitalization and Business Innovation-Mobilizing Agendas for Business Innovation under Grant 02/C05-i01/2022; and in part by the National Funds through the FCT-Foundation for Science and Technology, I.P., and the European Social Fund, through the Regional Operational Program Centro 2020, within the scope of the projects under Grant UIDB/05583/2020 and Grant CISUC UID/CEC/00326/2020. (Corresponding author: Tiago Cruz.)

Fig. 2. IEC 61850 smart substation architecture, composed of process, bay, and station levels connected by station and process communication buses.

Fig. 5. Phasor measurement unit. Power line current and voltage measurements are sampled, quantized, encoded and time-stamped according to the IEEE C37.118 standard protocol.

Figure 10 presents the five IEC 61850 application-domain communication profiles, as well as their underlying communication stack. Substation Configuration Language: IEC 61850-6 [62] specifies an XML-based Substation Configuration Language

TABLE II SURVEYS ON SMART GRID AND SMART SUBSTATION CYBERSECURITY PUBLISHED SINCE 2016

TABLE III COMPARISON BETWEEN TRADITIONAL AND SMART GRIDS

TABLE IX COMPARISON BETWEEN IEEE C37.118 AND IEC 61850

TABLE X MAJOR CYBERATTACKS ON ELECTRICAL INFRASTRUCTURES

TABLE XII COMPARISON OF INTRUSION DETECTION SYSTEMS' PROPERTIES FOR THE INTRA-SS DOMAIN

TABLE XIV COMPARISON OF INTRUSION DETECTION SYSTEMS' PROPERTIES FOR PTP CLOCK PROTECTION

TABLE XIX SECURITY MEASURES AND TOOLS IN DIFFERENT CYBER KILL CHAIN STAGES