Quantitative Resilience-Based Assessment Framework Using EAGLE-I Power Outage Data

Catastrophic impacts to power systems caused by extreme weather events have significantly increased during the last decade. These events highlight the need to develop approaches to assess the resilience of power systems against extreme events; however, the availability of data that capture power system performance during and after disruptive events is scarce. This paper proposes an assessment framework to evaluate the performance aspects of the power grid during extreme outage events using the Environment for Analysis of Geo-Located Energy Information (EAGLE-I) data. EAGLE-I data include information related to the number of impacted customers, the duration, and the location of power outages in the United States. Statistical analyses were conducted to extract resilience-based outage data and derive probability distribution functions of their impact and recovery characteristics. A list of extreme events is identified based on population-based threshold values. Metrics from other power outage assessments were used to measure the characteristics of each event, including the impact rate and duration, the recovery rate and duration, and the impact level. A probability distribution function is obtained for each metric. The proposed framework is conducted for each state across the United States. The obtained results provide a probabilistic representation of state-level outage behaviors, which can be applied as a framework to evaluate various resilience enhancement techniques.


I. INTRODUCTION
Modern society has grown to rely on electricity access and availability. When electricity is unavailable, individuals, communities, and countries are subject to economic and physical harm, especially when an electricity outage occurs during an extreme weather event (e.g., extreme heat or cold). Reliability has long been an important indicator for electric grid operators, but as the frequency and intensity of extreme weather events have increased in recent years, yielding prolonged outages and significant economic losses [1], [2], resilience has become a larger focus for grid operators and customers. Also, aging infrastructure plays a vital role in increasing the prevalence and costs of extended outages [3], [4]. Planners at the facility, local, state, and federal levels are interested in resilience enhancement and evaluation solutions to reduce the likelihood and impact of power outages [5], [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Peter Palensky.
Extended power outages are accompanied by noticeable socioeconomic impacts. Extreme outage events between 2003 and 2012 resulted in damage costs between $22 billion and $41 billion per year [7]. In the United States, prolonged power interruptions result in $43 billion to $62 billion in economic losses per year [8]. Also, extreme weather events have accounted for the majority of sustained power outages. For example, Hurricane Sandy caused more than 8 million customers to lose power across 15 states in the United States [7]. In 2021, Winter Storm Uri caused widespread power outages in Texas during extreme cold, which resulted in 246 recorded deaths and left more than 4 million customers without power for a few days [9]. During the last 8 years, the United States has been exposed to 7 wildfires, 7 droughts, 73 severe storms, 19 tropical cyclones or hurricanes, 13 floods, 4 winter storms, and 1 freeze event, each with more than $1 billion in anticipated costs [10].
The fast and efficient restoration of the power grid after disruptive events is one of the most important attributes of a resilient power supply. The quick recovery of the grid infrastructure reduces associated economic and community impacts [11], [12]. Also, decision makers have many competing demands for limited resources, meaning resilience investments are required to demonstrate ''significant and measurable short and long-term benefits that balance or exceed the costs'' [6]. Comparing the costs of installing, operating, and maintaining a resilience investment to its benefits, which include decreased outage costs, helps planners determine which resilience investments are worth implementing. Conducting cost-benefit analyses to validate resilience investments has become a requirement for many federal and state resilience grants [13]. These challenges require the development of resilience evaluation methodologies to quantify the impact of extreme events on power systems. Moreover, such methods help identify potential system upgrades and improvements and support cost-benefit analyses of them.

A. RELATED WORK
Several definitions for power system resilience exist. In this paper, we use the following: ''The ability of a system to prepare for, absorb, adapt to, and recover from disruptive events'' [5]. Attributes of resilience include preparedness, recovery, adaptability, and reliability, to name a few. Electric reliability is the likelihood that electricity will be available during normal equipment failures, and grid operators have a long history of using reliability metrics. There have been no standardized metrics for resilience quantification [14]; however, to compare resilience across infrastructure domains and jurisdictions, there is a need for publicly available data sets with transparent metrics for, or attributes of, resilience [15]. Having reliable and accurate data is the first step toward understanding the behavior and performance of electric power systems during extreme events. These data sets can be used by (a) individuals and communities to perform cost-benefit analyses on resilience measures, such as backup power systems or islandable microgrids, and mitigation strategies, such as hardening transmission and distribution lines; and by (b) government entities to compare resilience performance across infrastructure systems. Also, data sets can be leveraged to extract system features and extreme event characteristics for resilience analyses. Therefore, robust statistical analysis can be carried out by using extreme event data and by quantifying their characteristics.
A few data sets have been prepared and reported that focus on extreme events and their associated impacts on power grids. Major outage event data are available from the U.S. Department of Energy (DOE) Office of Electricity (OE-417) [16]. These data report extreme outage events based on DOE's extreme thresholds [17]. DOE refers to major outages as those that impact at least 50,000 customers or cause unplanned power loss exceeding 300 MW. Also, the U.S. Energy Information Administration (EIA) provides the Annual Electric Power Industry Report, Form EIA-861 data [18]. EIA-861 contains statistics from 1990-2020, including utility outage, electricity usage, and number of customers. It also provides utility-reported reliability indices with and without major event days.
Though different approaches have been proposed to distinguish between outages that belong to reliability analysis and those that belong to resilience analysis, gaps still exist. For instance, a time-based threshold was used in [19] to identify prolonged outage events for the resilience evaluation framework. In [20], [21], and [22], a temporal perspective with a 24-hour mark was used as a threshold to differentiate between short- and long-duration outages. Also, a quantitative threshold based on the number of customers without power or the amount of lost energy has been used to identify extreme outage events, as proposed by DOE [17]. In [23], a computational resilience interactions simulation platform was developed to quantify the resilience of transmission networks using utility outage statistics. Other approaches include assessing lifeline infrastructure restoration behavior using predefined extreme weather events [24], similar to a description of power outages using retrospective analyses (e.g., outages from 2000 to 2016) [17]. Most of these methods have conducted basic analysis of the existing data for specific weather events or defined geographic regions. The importance of extracting distribution functions governing the behavior of extreme outage events has not been deeply investigated, highlighting a research gap in resilience evaluation processes. Table 1 provides a summary of the current state of the art in quantifying resilience using outage data and the associated challenges.

B. RESEARCH GAPS
Though DOE collects electric disturbances through the mandatory outage event (OE-417) [16] reports, these data are of insufficient quality to estimate outage duration for several reasons. First, customer outages are not always tracked and are sometimes incorrect. These data mainly rely on reporting outages using DOE's major event thresholds (more than 300 MW or 50,000 customers), which might not capture extreme events in small communities. These data do, however, provide insightful information on event start dates, end dates, and event locations for major outage events. On the other hand, EIA data do not provide sufficient information to determine the distribution and behavior of extended power outages. The EIA-861 reliability metrics, namely the System Average Interruption Frequency Index (SAIFI), System Average Interruption Duration Index (SAIDI), and Customer Average Interruption Duration Index (CAIDI), reveal basic system performance information that might not be very representative for resilience evaluation. In short, the existing data cannot be directly used to understand the characteristics of extreme power outages at the country and state levels. Although other sources of power outage data exist [25], the accessibility of these data is very limited.
VOLUME 11, 2023
The diversity of events that can impact the grid makes it difficult to make generalized statements about what qualifies as a ''major'' event. A town of fewer than 1,000 people could be without power for weeks, yet it would not be considered a major event at the national scale, even though it would certainly have significant economic impacts on the town. In contrast, short-duration and high-frequency events that regularly impact a small number of people could make it difficult for communities to rely on power, affecting their ''access'' to electricity, as is often the case with remote communities that have a low quality of power supply. Therefore, a proper quantification framework is required to capture extreme outages while taking demographic and population factors into account.

C. CONTRIBUTIONS
The goal of this paper is to examine a publicly available data set to evaluate extreme power outages and their characteristics. Our primary source of outage information is the Environment for Analysis of Geo-Located Energy Information (EAGLE-I™) data set, provided by Oak Ridge National Laboratory (ORNL) [26]. The proposed assessment framework includes three main phases: filtration methodology, evaluation metrics, and probabilistic curve fitting. In the filtration phase, we use threshold values to identify extreme power outages. A population-based threshold is proposed to account for diverse extreme outages across different jurisdictions. The population-based threshold is defined as a filtration threshold that relies on the population and the number of utility customers in a specific geographical location to refine extreme power outage events. The evaluation metric phase aims to measure characteristics of extreme outages, including outage duration, number of customers affected, and restoration time. The probabilistic curve fitting phase focuses on testing and providing probability distribution functions (PDFs) governing the behavior of the proposed evaluation metrics. Useful information from both OE-417 and EIA-861 data is used to justify and validate the efficiency of the proposed algorithm. Fig. 1 shows an illustrative framework of the proposed methodology.
The contributions of this paper are listed as follows:
• Provide detailed statistics on extreme power outages on the national, state, and county levels.
• Develop a statistical methodology to quantify extreme electricity outages from the EAGLE-I data set considering the relative relationship between the outage level and the size of the area under study.
• Introduce metrics to measure and assess the behavior of extreme outage events in terms of their temporal and impact characteristics.
• Create and evaluate proper distribution functions that govern the behavior of these metrics at the state level.

D. PAPER STRUCTURE
The remainder of the paper is organized as follows. Section II describes the EAGLE-I data and the data processing techniques required to overcome some of the associated false or missing information. Section III explains the proposed assessment framework to quantify extreme power outages through filtration, evaluation, and representation phases. Section IV illustrates the implementation procedure and presents results, and Section V provides some concluding remarks.

II. EAGLE-I DATA
This study leverages EAGLE-I data to assess the behavior of extreme outage events at the state level of the United States. This section provides a description of the EAGLE-I data. Also, it addresses data processing techniques to overcome some of the data quality challenges.

A. DATA DESCRIPTION
EAGLE-I data are collected and managed by ORNL. This data set spans November 2014 to the present and is collected by scripted web scrapers that check utilities' publicly available outage maps to estimate the number of customers without power by utility in a given county. Outage records are updated every 15 minutes. This work considers outage records from November 2014-March 2021. Table 2 shows a sample of EAGLE-I data across a few selected counties of three states on July 1, 2015, at 10:30 AM. Note that this table includes a few selected data records for visualization. For instance, only three counties are shown for each state. Each record consists of temporal, geographic, and service provider information. This helps track the source of the outage and provides a deeper spatiotemporal analysis.
A few remarks about the EAGLE-I data should be noted. First, service providers or utilities usually have customers in different counties. For instance, the utility Entergy Louisiana has recorded outages in both Calcasieu and Jefferson counties in the State of Louisiana at the same time. The listed utility identification number can be used to assign outage events to specific service providers regardless of where the outages occur. Though many counties have only a single utility, counties that span large geographic boundaries or counties that intersect metropolitan regions might have multiple utilities. EAGLE-I reports customer outages for each utility in each county. For example, Los Angeles County in the State of California is supplied by multiple utilities, including the Los Angeles Department of Water and Power and Southern California Edison. The Federal Information Processing Standards (FIPS) code gives each county across the United States a unique index, which avoids ambiguity between counties that share a name.
Because the data set is based on the number of utility websites that can be scraped, the number of utilities in the data set and their geographic granularity change over time. Also, the number of counties changes over time. Table 3 tracks the number of counties and utilities recorded during the studied period (November 2014-March 2021). This growth also reflects the fact that more utilities are making their outage data publicly available. As a result of this evolution over time, the raw data cannot be directly used to estimate year-over-year changes, and the average number of years must be based on the first records; however, the proposed assessment framework can still provide insight into extreme power outages. Additionally, because of the EAGLE-I sampling methodology, statistical methods that assume more samples will converge on the ''ground truth'' are not applicable: more samples from one utility do not provide outage data for another utility in the same county. This limits the statistical work that can be done on these data to reduce uncertainty.

B. DATA PROCESSING
As with many large data sets, the raw form of the EAGLE-I data includes some data quality issues, including missing information and data discrepancies. To improve the quality of the data, different data transformation techniques, such as smoothing, aggregation, and normalization, must be applied. Because this study focuses on state-level outage behavior, proper aggregation is required to group outages for each state. Fig. 2 shows the outage count for five counties in Alaska starting November 30, 2018, at 3:00 pm for 24 hours. The outage profile represents outage aggregation on the utility level for each county. One county clearly experiences a sequence of extreme outages, whereas the other counties have very low numbers of impacted customers. Outages occurring at the same time at different geographic locations might not be correlated unless a major extreme event is impacting several counties or nearby states. Therefore, it is not appropriate to aggregate outages simply because they occur at the same time. Also, some data are not recorded at 15-minute intervals. For example, a 6-hour duration should have 24 data records; however, this is not the case for Matanuska-Susitna County, where only 15 records are observed during the event peak (18:00-23:59 on November 30, 2018).
Of eight counties in Alaska, only five have provided outage records. The existing eight counties are supplied by only four utilities, implying the existence of more than one utility in some counties. Having more than a sole service provider in a single county is normal within the EAGLE-I data. Because this work focuses on the state level, it is assumed that outages are aggregated geographically across the same county.
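The aggregation and sampling checks described above can be illustrated with a minimal sketch. The record layout, the utility identifiers, and the customer counts below are hypothetical and chosen only for illustration; they are not the EAGLE-I schema or real outage values. The sketch sums customers without power over all utilities in a county at each timestamp and lists the 15-minute instants that have no record.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical records: (county FIPS code, utility id, timestamp, customers out).
records = [
    ("02170", 101, datetime(2018, 11, 30, 18, 0), 1200),
    ("02170", 102, datetime(2018, 11, 30, 18, 0), 300),
    ("02170", 101, datetime(2018, 11, 30, 18, 15), 1500),
    ("02170", 101, datetime(2018, 11, 30, 18, 45), 900),
    ("02020", 103, datetime(2018, 11, 30, 18, 0), 40),
]

def aggregate_by_county(rows):
    """Sum customers out over all utilities for each (county, timestamp)."""
    totals = defaultdict(int)
    for fips, _utility, ts, customers_out in rows:
        totals[(fips, ts)] += customers_out
    return dict(totals)

def missing_intervals(timestamps, start, end, step=timedelta(minutes=15)):
    """Return the expected sampling instants in [start, end) that have no record."""
    seen = set(timestamps)
    missing = []
    t = start
    while t < end:
        if t not in seen:
            missing.append(t)
        t += step
    return missing

county_totals = aggregate_by_county(records)
stamps = [ts for (fips, ts) in county_totals if fips == "02170"]
gaps = missing_intervals(
    stamps, datetime(2018, 11, 30, 18, 0), datetime(2018, 11, 30, 19, 0)
)

# Number of records a fully sampled 6-hour window should contain.
expected_records = timedelta(hours=6) // timedelta(minutes=15)
```

Comparing `expected_records` with the number of observed timestamps in a window gives a quick measure of how incomplete the sampling is for a given county.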

III. ASSESSMENT FRAMEWORK
This section explains the proposed assessment framework to analyze the behavior of extreme outage events across the United States. The proposed framework consists of three main steps: first, filtering the preprocessed EAGLE-I data to extract extreme outages based on defined thresholds, then extracting the characteristics of the extreme outages using a few evaluation metrics, and, finally, conducting statistical analyses to compute PDFs representing each metric.

A. FILTRATION METHODOLOGY
Most power outages are classified as minor disturbances to the distribution system [27]. The main causes of minor disturbances include (a) technical causes, such as a short circuit, equipment failure, or system malfunction; (b) environmental causes, such as vegetation, animals, and inclement weather; and (c) human causes, such as physical accidents and human control errors [28]. On the other hand, major power outages are characterized by prolonged outage durations and elevated outage impacts [1]. The primary cause of major power outages is extreme weather events, mainly hurricanes, wildfires, and ice storms. Also, it is expected that vulnerabilities of power systems to cyberattacks will dramatically increase, yielding potentially catastrophic blackout events [4]. Because of the diversity of outage causes, it is difficult to produce representative outage behavior.
Various studies have been conducted to develop a criterion to distinguish between minor and major outages [20], [22]. DOE identifies extreme outage events as those that exceed 300 MW or 50,000 customers [17]. In [19], a 24-hour threshold was used to identify major events. The trade-off between a time-based threshold and an impact-based threshold imposes further challenges on identifying proper threshold values. Also, the correlation between the number of impacted customers and the relative geographic boundaries makes it difficult to generalize a specific threshold across the state level or the county level. For example, an event causing a power outage to a whole county will be perceived as a major event at the county level, but it might be ignored at the state or country level. Therefore, a proper threshold that accounts for both the temporal and impact behaviors should be carefully and efficiently computed.
The electric power outages in the EAGLE-I data set are represented by the number of customers without power, which varies from zero to a maximum value of almost 10 million customers during Hurricane Irma. A customer here is defined as any entity that purchases energy from a utility via a tariff. This means that ''customer'' should not be interpreted as ''persons''. Residential households with only one tariff often have multiple residents, and commercial entities sometimes have multiple tariffs for a single site.
In this work, we identify two thresholds, high ($\alpha_h$) and low ($\alpha_l$), to extract and evaluate extreme outage events. An extreme outage event is defined as the set of contiguous customer outage records bounded by the threshold crossings that start and end the event. First, the $\alpha_h$ threshold acts as a filter to identify extreme outages. In other words, outages exceeding this threshold will be considered extreme outages. The chosen value of $\alpha_h$ will have a significant impact on the analysis results. A higher threshold value results in fewer extreme events. Additionally, from an energy justice perspective, rural communities with fewer than 50,000 customers will never register as having a long-duration outage in the current methodology. Therefore, the value of $\alpha_h$ will be selected to capture impacts on small communities as well as large communities. Upon determining the extreme events, it is required to extract their temporal characteristics. The $\alpha_l$ threshold acts as a trigger to identify both the start and end time of a specific event. The value of $\alpha_l$ is always lower than $\alpha_h$ to provide a realistic representation of the event duration.
Because of the diverse outage events, other behaviors are noticed. Fig. 6 shows a special case where multiple event triggers can be identified within a single event. This figure shows the outage behavior in Macropia County in the State of Alabama. It is clear that using a single threshold might lead to more than one trigger value, specifically when data completeness and accuracy are not guaranteed. The zoomed view highlights the intermediate event triggers. Using the $\alpha_l$ threshold solves this issue by ignoring all the intermediate triggers within the event start and end times. In other words, only the very first up-trigger and the very last down-trigger are considered across the event duration.
Though data aggregation methods can be used to reduce the negative impacts of data inaccuracy, these methods require extensive analysis to determine the proper aggregation interval. This induces more challenges at the country level because a fixed aggregation interval might not be convenient for each county.
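The two-threshold filtration described in this subsection can be sketched as follows. This is one possible reading of the method, not the authors' implementation: an event is taken to be a maximal run of consecutive records above the low threshold whose peak exceeds the high threshold, so crossings of the high threshold inside the run are ignored. The function name, the sample series, and the threshold values are hypothetical.

```python
def extract_events(series, alpha_l, alpha_h):
    """Return (start_index, end_index) pairs of extreme outage events.

    An event is a maximal run of consecutive records above alpha_l whose
    peak exceeds alpha_h (an assumed interpretation of the two thresholds).
    """
    events = []
    start = None
    peak = 0
    for i, value in enumerate(series):
        if value > alpha_l:
            if start is None:
                start, peak = i, value       # first up-trigger
            else:
                peak = max(peak, value)
        elif start is not None:
            if peak > alpha_h:               # keep only extreme runs
                events.append((start, i - 1))
            start = None                     # last down-trigger
    if start is not None and peak > alpha_h:
        events.append((start, len(series) - 1))
    return events

# Customers without power at consecutive sampling instants (illustrative).
outages = [0, 10, 60, 300, 240, 280, 120, 40, 0, 70, 90, 30, 0]

# Two thresholds: one event spanning the whole disturbance.
events = extract_events(outages, alpha_l=50, alpha_h=250)

# Single threshold (alpha_l == alpha_h): the dip to 240 splits the event.
single = extract_events(outages, alpha_l=250, alpha_h=250)
```

With the sample series, the two-threshold call yields one event covering indices 2 through 6, whereas the single-threshold call splits the same disturbance into two one-record events, which mirrors the intermediate-trigger problem discussed above.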

B. EVALUATION METRICS
The behavior of extreme events might vary from one event to another. Also, the associated negative impacts on the system performance rely on the event type and the geographic boundaries. The concept of resilience through a disturbance and impact resilience curve has highlighted the importance of assessing extreme power outages [29], [30]. A conceptual resilience curve was developed in [31] to define and quantify power system resilience. These studies have shown that the performance of a system relative to its optimal and minimum performance levels usually follows a triangular or trapezoidal behavior [1], [32]. These resilience curves provide temporal representations of system performance before, during, and after an extreme event. Various resilience metrics have been developed and quantified using these curves.
One power system performance indicator is the number of served customers at each time instant. This is a vital metric to compute reliability indices, such as SAIFI, SAIDI, and CAIDI [33]. The number of customers without power is the complementary value of the number of served customers for a specific utility, county, state, and country. The sequential outage records can be used to quantify the characteristics of extreme outage events. Following the same resilience metric conventions, a few metrics are defined to measure the behavior of extreme outages; those analyzed in Section IV-B are the event duration ($T_e$), the impact duration ($T_i$), the recovery duration ($T_r$), and the impact level ($O_i$). Leveraging the defined metrics, three additional metrics are proposed: the impact rate, the recovery rate ($R_r$), and the recovery/impact ratio ($R_{ri}$). The recovery/impact ratio is the ratio between the recovery and the impact duration; it can be calculated using $R_{ri} = T_r / T_i$.
Though some events do not follow linear impact or recovery rates, most events follow a monotonically increasing behavior before reaching the maximum impact level and a monotonically decreasing behavior after reaching the maximum impact level. Also, the linear rate function is adopted based on the triangular resilience curve described in [34]. This provides a simple, straightforward model to measure outage characteristics. A detailed piecewise linear function can be used to create a comprehensive mathematical model representing power outage behaviors. Fig. 7 projects the defined metrics on a single event. The impact of Hurricane Harvey on Jefferson County in Texas during August/September 2017 is used for clarification.
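As a minimal sketch of how these metrics could be computed for one extracted event, the following function assumes a triangular event shape. The rate formulas used here (peak outage level divided by the impact or recovery duration) and the symbol `R_i` for the impact rate are assumptions made for illustration; only $R_{ri} = T_r / T_i$ is stated explicitly in the text. The function name and the sample event are hypothetical.

```python
def event_metrics(times_h, customers_out):
    """Compute event metrics from time stamps (hours) and outage levels.

    Assumes a triangular event: impact lasts from the first record to the
    peak, and recovery lasts from the peak to the last record.
    """
    peak = max(customers_out)
    k = customers_out.index(peak)          # first occurrence of the peak
    t_i = times_h[k] - times_h[0]          # impact duration
    t_r = times_h[-1] - times_h[k]         # recovery duration
    return {
        "T_e": times_h[-1] - times_h[0],   # total event duration
        "T_i": t_i,
        "T_r": t_r,
        "O_i": peak,                       # impact level (peak customers out)
        # Linear rates under the triangular assumption (not from the paper).
        "R_i": peak / t_i if t_i > 0 else None,
        "R_r": peak / t_r if t_r > 0 else None,
        "R_ri": t_r / t_i if t_i > 0 else None,
    }

# Illustrative event: hours since the first trigger and customers out.
times_h = [0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0]
customers = [600, 2000, 4500, 6000, 5200, 3000, 1500, 400]
m = event_metrics(times_h, customers)
```

Events whose first record is already the peak have a zero impact duration, so the sketch returns `None` for the impact rate and the recovery/impact ratio rather than dividing by zero.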

C. PROBABILISTIC CURVE FITTING
Reliability metrics usually consider power outages due to normal failures of power grid components. To distinguish between resilience-based and reliability-based metrics, only events exceeding the $\alpha_h$ threshold are identified. In this work, the outage level is leveraged to quantify extreme events because the EAGLE-I data do not provide customer-level outage durations. Because short-duration events are more likely to occur than long-duration events, averaging over all events, as reliability studies usually do, underestimates the significance of the resilience events. In determining the likelihood and duration of power outages for resilience valuation, PDFs are created from historical data after discarding low-impact events.
The main goal of this work is to determine the probability distributions for the proposed event metrics based on extracted extreme outages from the EAGLE-I data set. These PDFs can be used to simulate diverse extreme outage events for resilience-based studies and to estimate the cost versus benefit of resilience mitigation strategies. For each state, a list of extreme outage events at the county level is obtained. The corresponding characteristics of these events are computed. Curve fitting approaches are used to evaluate the best-fit PDF governing the behavior of each metric. More than 80 PDFs are tested, including normal, exponential, Pareto, double Weibull, t, gamma, lognormal, beta, and log-gamma. The detailed list and information regarding each PDF can be found in [35] and [36]. The residual sum of squares criterion is used to evaluate the goodness of fit of each PDF.
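The selection of a best-fit PDF by residual sum of squares can be illustrated with a small, self-contained sketch. It is not the procedure used in the paper: it tests only two candidate distributions (exponential and lognormal), fits them by maximum likelihood, and compares each fitted density with a normalized histogram at the bin centers. The synthetic data, bin count, and function names are assumptions for illustration.

```python
import math
import random
import statistics

def histogram_density(data, bins):
    """Return bin centers and normalized densities of an equal-width histogram."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = [c / (len(data) * width) for c in counts]
    return centers, density

def fit_exponential(data):
    """Maximum-likelihood exponential fit (location fixed at zero)."""
    rate = 1.0 / statistics.fmean(data)
    return lambda x: rate * math.exp(-rate * x)

def fit_lognormal(data):
    """Maximum-likelihood lognormal fit from the mean and spread of log(x)."""
    logs = [math.log(x) for x in data]
    mu, sigma = statistics.fmean(logs), statistics.pstdev(logs)
    def pdf(x):
        z = (math.log(x) - mu) / sigma
        return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))
    return pdf

def rss(pdf, centers, density):
    """Residual sum of squares between histogram densities and a fitted PDF."""
    return sum((d - pdf(c)) ** 2 for c, d in zip(centers, density))

def best_fit(data, bins=30):
    """Fit each candidate and return the name with the smallest RSS."""
    centers, density = histogram_density(data, bins)
    candidates = {
        "exponential": fit_exponential(data),
        "lognormal": fit_lognormal(data),
    }
    scores = {name: rss(pdf, centers, density) for name, pdf in candidates.items()}
    return min(scores, key=scores.get), scores

# Synthetic positive-valued sample standing in for event durations (hours).
random.seed(0)
durations_h = [random.lognormvariate(2.5, 0.6) for _ in range(2000)]
name, scores = best_fit(durations_h)
```

Extending the `candidates` dictionary with further distributions follows the same pattern; a statistics library with built-in fitting routines would be the practical choice when many candidates are tested.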

IV. IMPLEMENTATION AND RESULTS
The methodology outlined in the previous section was applied to the EAGLE-I data set, which includes more than 130 million customer outage record values. This section provides detailed statistical analysis at the state level across the United States. The PDFs governing the behavior of extreme outage events were calculated and evaluated, and the results are detailed in this section.

A. EXTREME OUTAGE STATISTICS
The proposed method mainly depends on the chosen value of the filtration thresholds. A higher value of $\alpha_h$ results in missing outages that can be classified as extreme events. Also, a higher $\alpha_l$ will yield an inaccurate representation of the event characteristics, resulting in biased results. This case provides the sensitivity analysis between the threshold values and the number of extracted events.
To validate the efficiency of the proposed filtration method, the major outage event data (OE-417) [16] are used as a benchmark. The total number of reported events in OE-417 should match the number of events extracted from the EAGLE-I data for the same period after applying the DOE threshold. Our previous work in [19] provided extensive analysis on the OE-417 data. Based on the OE-417 data, 605 events took place from November 2014-March 2021, compared to 598 events extracted from the EAGLE-I data. This shows that the proposed filtration method is capable of extracting outage records corresponding to reported major events.
As mentioned before, the DOE threshold for identifying extreme events is suitable for highly populated regions, specifically large cities and metropolitan areas; however, outages in rural communities with fewer than 50,000 customers will not be considered. Using a fixed threshold value introduces disparities because of the high variations in the geographic and population characteristics among different states and counties. In this work, we propose a dynamic threshold as a function of the county population level. The population-based threshold is defined as a filtration threshold that relies on the population and the number of utility customers in a specific geographical location to refine extreme power outage events. To reconcile the ''customer'' in the EAGLE-I data with the ''resident'' in the population statistics, a scaling factor of two is used. In other words, it is assumed that a single customer is equivalent to two residents because multiple residents might have a single tariff. The scaling factor of two is based on available data from EIA-861. With a total of almost 158 million installed electric meters representing utility customers and a population of almost 330 million residents in the United States, a utility customer is equivalent to 2.07 residents. Accordingly, an outage exceeding 25% of subscribed customers is assumed to be an extreme outage. In short, values of $\alpha_l$ and $\alpha_h$ are selected to be 5% and 25% of subscribed customers, respectively. Table 4 shows a summary of the number of extracted events using the DOE threshold and the dynamic population-based threshold as values for $\alpha_h$ at the state level. The population-based threshold clearly yields a higher frequency of events because low-populated counties have been taken into consideration. Also, the value of the DOE threshold seems relatively high for many counties, resulting in zero extracted extreme events in some states.
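A minimal sketch of the population-based thresholds follows. It assumes that the number of subscribed customers in a county is estimated as the county population divided by the scaling factor, and that the thresholds are the stated 5% and 25% fractions of that estimate. The function name, the constant name, and the example population are hypothetical.

```python
DOE_CUSTOMER_THRESHOLD = 50_000  # fixed DOE major-event threshold (customers)

def population_thresholds(county_population, scaling_factor=2.0,
                          low_fraction=0.05, high_fraction=0.25):
    """Return (alpha_l, alpha_h) in customers for one county.

    Customers are estimated as population / scaling_factor, which is an
    assumed reading of the population-based threshold.
    """
    customers = county_population / scaling_factor
    return low_fraction * customers, high_fraction * customers

# A small county with 8,000 residents (about 4,000 estimated customers).
alpha_l, alpha_h = population_thresholds(8_000)

# Such a county can never reach the fixed DOE threshold, because its total
# estimated customer base is smaller than 50,000.
can_trigger_doe = 8_000 / 2.0 >= DOE_CUSTOMER_THRESHOLD
```

In this example the high threshold is on the order of 1,000 customers, so an outage affecting a quarter of the county registers as extreme even though it is far below the fixed DOE threshold.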
The population-based threshold provides a larger set of events for each state, which can be used to estimate PDFs governing event characteristics. It is worth noting that a higher scaling factor in the population-based threshold lowers the value of $\alpha_h$, which increases the number of extracted events.
Some important remarks include the dramatic increase in the number of extreme outages when using population-based thresholds. Though some states recorded zero outage events using the DOE threshold, a noticeable number of events are captured using the population-based threshold. This implies the ability of the proposed population-based threshold to capture extreme outage events, specifically in regions with smaller rural communities. The top ten states in terms of the number of extreme outage events are ranked as follows: Texas, Virginia, West Virginia, Georgia, Kentucky, Louisiana, Oklahoma, North Carolina, Arkansas, and Kansas. Five of the ten states are also in the top ten for extreme weather events exceeding $1 billion in economic losses for the same study period [10]: Texas, Georgia, North Carolina, Virginia, and Oklahoma.

B. SINGLE STATE ANALYSIS
Once the filtration threshold is implemented on the preprocessed EAGLE-I data, extreme outage events are identified. This case shows the implementation of the proposed assessment framework on a single state. For illustration, Florida is selected, and the analysis is conducted using the population-based threshold with a scaling factor of two. The following figures show the statistical analysis of the extracted events for each metric, highlighting the average and maximum values. Each figure represents a specific metric and reveals insightful information about the behavior of extreme outage events.
Fig. 8 shows the Florida probabilistic histogram of the T_e metric. On average, extreme outages last almost 36 hours, with a tendency toward shorter durations. Though many events have a total duration of less than 40 hours, a nonnegligible number of events last more than 100 hours. This aligns with the fact that Florida is one of the states most impacted by hurricanes. Less than 35% of the events have a total duration of 5 hours; these are usually events that barely exceed the cutting threshold value. Fig. 9 shows the Florida probabilistic histogram of the T_i metric, which indicates how long outages continue to impact a specific region. The impact duration metric has an average of almost 9 hours. Forty percent of the extreme outages reach their maximum impact within 2 hours. A high maximum value is observed, caused by the long impact duration of hurricanes, which might last a few days. Fig. 10 shows the Florida probabilistic histogram of the T_r metric. The recovery metric measures the capability of the power system to return to the pre-event status. Though the EAGLE-I data do not track the customer status at each interval, the value measured here provides a simplified view of the overall restoration behavior of a specific state. In Florida, extreme outages take, on average, 40 hours to restore power to almost all customers. Though most events have a recovery duration of less than 35 hours, a noticeable number of events have a recovery duration of more than 50 hours.
Fig. 11 shows the Florida probabilistic histogram of the O_i metric. More than 70% of the extreme outage events impact fewer than 20,000 customers per county. Note that extreme weather-related events might affect multiple counties because of their diverse spatiotemporal propagation characteristics. Correlating geographic and temporal information about these events with EAGLE-I data would enable a more extensive event-based analysis; however, the lack of such information makes this challenging. The instantaneous maximum number of unserved customers during Hurricane Irma was 1.8 million, 1.4 million, and 1.1 million customers in Miami-Dade, Broward, and Palm Beach counties in Florida, respectively. The values provided by this metric can be used to quantify the resilience capability of a specific county or utility. More than 80% of the extracted events have an impact rate of less than 25,000 customers per hour. Fig. 13 shows the Florida probabilistic histogram of the R_r metric, which measures how fast power is restored after an extreme event. The higher the recovery rate is, the more resilient the power system is. On average, 28,000 customers regained their service within 1 hour in Florida. Though this might seem like an acceptable recovery rate, it mainly depends on the number of customers who lost power. For instance, for a hurricane affecting millions of customers, it might take a few days to restore power to all who were impacted.
Fig. 14 shows the Florida probabilistic histogram of the R_ri metric. In general, the restoration (recovery) duration is longer than the impact duration; however, this metric provides a more concise evaluation of the relationship between the restoration and the impact at the state level. In Florida, the recovery duration is usually less than 28 times the impact duration. Also, this metric provides an evaluation indicator of the utility's performance within a specific county or state.
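As an illustration of how the per-event metrics discussed above relate to one another, the following minimal Python sketch derives them from a single event's outage curve. It is not the authors' implementation. The function name `event_metrics`, the sample curve, and the specific definitions used here (impact duration measured from the event start to the peak, recovery duration from the peak to the event end, and rates taken as the peak customer count divided by the corresponding duration) are assumptions made for illustration only; the formal definitions are those given in the methodology section.

```python
def event_metrics(times_h, customers_out):
    """Derive illustrative resilience metrics from one event's outage curve.

    times_h:       sample times in hours, in ascending order.
    customers_out: customers without power at each sample time.
    Every definition below is an assumption made for this sketch.
    """
    if len(times_h) != len(customers_out) or len(times_h) < 3:
        raise ValueError("need at least three aligned samples")

    # Index of the sample with the largest number of unserved customers.
    peak_idx = max(range(len(customers_out)), key=customers_out.__getitem__)
    o_i = customers_out[peak_idx]                 # impact level (peak)
    t_i = times_h[peak_idx] - times_h[0]          # impact duration (h)
    t_r = times_h[-1] - times_h[peak_idx]         # recovery duration (h)
    if t_i <= 0 or t_r <= 0:
        raise ValueError("peak must lie strictly inside the event window")

    return {
        "T_e": times_h[-1] - times_h[0],  # total event duration (h)
        "T_i": t_i,
        "T_r": t_r,
        "O_i": o_i,
        "R_i": o_i / t_i,                 # customers losing power per hour
        "R_r": o_i / t_r,                 # customers restored per hour
        "R_ri": t_r / t_i,                # recovery-to-impact duration ratio
    }


# Hypothetical hourly outage curve for a single county-level event.
metrics = event_metrics(
    [0, 1, 2, 3, 4, 5, 6],
    [0, 4000, 12000, 9000, 6000, 2000, 0],
)
print(metrics)
```

Under these assumed definitions, an event that peaks quickly and recovers slowly yields an R_ri value greater than one, which matches the general observation that restoration takes longer than the impact phase.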

C. EVENT CHARACTERISTIC ANALYSIS
The methodology described in Section III-B was applied to the extracted events for each state. A population-based threshold with a scaling factor of three is adopted to create reasonably sized event sets for each state. Table 5 summarizes the mean and standard deviation values of all metrics for each state. This can be used to compare all states with respect to extreme power outages. In general, high standard deviations reflect a wide spectrum of diverse events, whereas small standard deviations imply that events lie close to the overall average value. This can be used to identify states with high power outage levels or prolonged outage durations.
Note that some states experience long-duration outage events, including Colorado, Connecticut, Idaho, Minnesota, Montana, Nebraska, Nevada, North Dakota, Oklahoma, and South Dakota, though many of these states have not recorded high numbers of extreme events. Also, the list of states exposed to outages with impacts exceeding 24 hours before the system starts to restore curtailed loads includes Alaska, Arizona, Arkansas, California, Colorado, Idaho, Iowa, Kansas, Michigan, Minnesota, Missouri, Montana, Nebraska, Nevada, North Dakota, Oklahoma, South Dakota, Tennessee, Texas, and Wyoming. Further investigation is required to trace the real causes of the impact durations. For instance, Nevada and California might exhibit long impact times due to high wildfire alert areas, which usually extend a few days to a couple of weeks. The recovery metric identifies the states that usually suffer from extended restoration times exceeding 2 days on average: Alaska, Colorado, Connecticut, Idaho, Minnesota, Montana, Nebraska, Nevada, North Dakota, Oklahoma, and South Dakota. The O_i metric reveals the average value of the number of unserved customers per county per event. For example, in Alabama, a maximum of 7,000 customers lost power per extreme event. It is clear that many states have average customer counts that are less than the 50,000 DOE threshold. This implies that the DOE major outage level cannot capture extreme outage events at the county level, yielding underestimated resilience values.
The R_i and R_r metrics are important for measuring the system restoration capabilities. These metrics measure the average number of customers who lose power and who have power restored per hour, respectively. Florida, New Jersey, and Delaware are the top states with high impact rates, whereas Arizona, Florida, and New Jersey are the top states with high recovery rates. Because of the trade-off between the causes of extreme events and the number of impacted customers, further investigation is required to correlate these metrics to the resilience level of the system. For example, Florida is highly impacted by hurricanes, resulting in high impact rates, but it also shows fast recovery rates. The relative relationship between T_i and T_r is captured in the R_ri metric. The smaller the R_ri metric is, the more resilient the system is. For instance, Hawaii has an impact-recovery ratio of almost one, implying that it takes the system a relatively small amount of time to return to the pre-event status. In North Dakota, however, the restoration time is more than 130 times the impact time. This significantly high value results from the existence of many events with impact durations of less than 15 minutes. In other words, for an event with a recovery duration of 1 hour and an impact duration of 5 minutes (1/12 hour), the R_ri value will be 12. To provide more realistic values, smaller recorded time steps would be required.
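The sensitivity of the ratio to short impact durations can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the example above: the function name `impact_recovery_ratio` and the choice to express both durations in minutes are assumptions made for illustration.

```python
def impact_recovery_ratio(recovery_min, impact_min):
    """Return R_ri = T_r / T_i, with both durations given in minutes."""
    if impact_min <= 0:
        raise ValueError("impact duration must be positive")
    return recovery_min / impact_min


# Recovery duration fixed at 1 hour; only the impact duration changes.
for impact_min in (5, 15, 60):
    print(impact_min, impact_recovery_ratio(60, impact_min))
# prints:
# 5 12.0
# 15 4.0
# 60 1.0
```

With the recovery duration held constant, shortening the impact duration from 60 minutes to 5 minutes raises the ratio from 1 to 12, so a state with many very short impact phases can show a large average ratio even when its restoration times are unremarkable.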
The last two rows of Table 5 provide the average values across the whole country. The weighted average is computed using the frequency of events provided in Table 4. These values can be used to rank and quantify the outage behavior in each state with respect to the country level. They also provide average estimates of extreme power outages across the United States. For instance, extreme outages usually last 76 hours, with impact rates of 6,268 customers per hour and recovery rates of 4,105 customers per hour.
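The difference between a simple average and an event-frequency-weighted average can be sketched as follows. The state labels, metric values, and event counts are hypothetical placeholders, not values from Table 4 or Table 5, and the function name `simple_and_weighted_mean` is an assumption made for illustration.

```python
def simple_and_weighted_mean(state_means, event_counts):
    """Return (simple mean, event-count-weighted mean) of a per-state metric.

    state_means:  {state: mean metric value for that state}
    event_counts: {state: number of extreme events in that state}
    """
    states = list(state_means)
    total_events = sum(event_counts[s] for s in states)
    if not states or total_events <= 0:
        raise ValueError("need at least one state with events")

    simple = sum(state_means[s] for s in states) / len(states)
    weighted = sum(state_means[s] * event_counts[s] for s in states) / total_events
    return simple, weighted


# Hypothetical per-state mean durations (hours) and event counts.
means = {"State A": 10.0, "State B": 20.0}
counts = {"State A": 1, "State B": 3}
print(simple_and_weighted_mean(means, counts))  # prints (15.0, 17.5)
```

Weighting by event count pulls the national figure toward states that contribute more events, so a state with few events but extreme metric values has less influence on the weighted row than on the simple average.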

D. PDF MODELS FOR EACH STATE
The proposed assessment framework is applied to each state to determine the best fit PDF of each metric. A population-based threshold with a scaling factor of two is adopted to create reasonably sized event sets for each state. Referring to Table 4, Delaware, the District of Columbia, and Hawaii have very limited numbers of events, resulting in less accurate PDF representations. The obtained results are provided in Table 6 and Table 7. In this study, we tested the goodness of fit of 84 PDFs to each metric using the residual sum of squares. The list of PDF names is provided in Table 8 in the appendix. Table 6 and Table 7 show the best PDF representing each metric in each state. The list of parameters governing each PDF is also provided. The index column (#) refers to the PDF index provided in Table 8. For example, in Alabama, the event duration metric (T_e) can be represented by the ''Fatiguelife'' distribution (index 68), with a location value of −0.17, a scale parameter of 11.21, and an argument parameter of 1.69. Following the same convention, one can easily locate the PDF representing a specific metric in a selected state. Different PDFs are found to represent the same metric across the states. This is caused by the diverse and unique extracted outages for each state; however,
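The residual-sum-of-squares comparison described in this subsection can be sketched in plain Python. This is a simplified illustration under several assumptions, not the procedure used to produce Table 6 and Table 7: only two candidate densities are compared instead of 84, the parameter estimation step is omitted (the fatigue-life candidate reuses the Alabama T_e parameters quoted above, and the exponential candidate uses the sample mean), the data are synthetic samples drawn from the fatigue-life distribution itself, and the shape/location/scale form of the density is assumed to match the convention of the quoted parameters. All function names are hypothetical.

```python
import math
import random


def fatiguelife_pdf(x, c, loc=0.0, scale=1.0):
    """Fatigue-life (Birnbaum-Saunders) density in shape/loc/scale form."""
    y = (x - loc) / scale
    if y <= 0:
        return 0.0
    core = (y + 1.0) / (2.0 * c * math.sqrt(2.0 * math.pi * y ** 3))
    return core * math.exp(-((y - 1.0) ** 2) / (2.0 * y * c ** 2)) / scale


def expon_pdf(x, mean):
    """Exponential density with the given mean (a deliberately simple rival)."""
    return math.exp(-x / mean) / mean if x >= 0 else 0.0


def sample_fatiguelife(rng, c, loc, scale):
    """Draw one fatigue-life variate via its standard normal representation."""
    half = c * rng.gauss(0.0, 1.0) / 2.0
    return loc + scale * (half + math.sqrt(half * half + 1.0)) ** 2


def histogram_density(samples, lo, hi, n_bins):
    """Return bin centers and density heights, counts / (n * bin width)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            counts[min(int((s - lo) / width), n_bins - 1)] += 1
    centers = [lo + (k + 0.5) * width for k in range(n_bins)]
    heights = [cnt / (len(samples) * width) for cnt in counts]
    return centers, heights


def rss(pdf, centers, heights):
    """Residual sum of squares between histogram heights and a density."""
    return sum((h - pdf(x)) ** 2 for x, h in zip(centers, heights))


# Synthetic "event durations" (hours); parameters are those quoted for Alabama.
C, LOC, SCALE = 1.69, -0.17, 11.21
rng = random.Random(0)
samples = [sample_fatiguelife(rng, C, LOC, SCALE) for _ in range(5000)]
centers, heights = histogram_density(samples, 0.0, 200.0, 40)
sample_mean = sum(samples) / len(samples)

candidates = {
    "fatiguelife": lambda x: fatiguelife_pdf(x, C, LOC, SCALE),
    "expon": lambda x: expon_pdf(x, sample_mean),
}
scores = {name: rss(pdf, centers, heights) for name, pdf in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

In a full workflow, each candidate's parameters would first be estimated from the data, and the candidate with the smallest residual sum of squares would then be reported as the best fit for that metric and state.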

V. CONCLUSION
This paper has proposed a resilience assessment framework to evaluate the characteristics of extreme outage events at the state level in the United States. The proposed approach extracts extreme events based on recorded outages in the EAGLE-I data. An aggregation process is conducted to sum the outages occurring at the same time across different service providers at the county level. A population-based threshold is identified and used to filter abnormal outages. Various evaluation metrics are used to capture the characteristics of extreme outage events, including magnitude and temporal behaviors. Extensive statistical analyses are conducted to test and determine the best fit PDF model governing the behavior of each metric across all states. The results show the capability of the proposed assessment framework to extract extreme outages while accounting for diverse jurisdictional properties and sizes. The proposed framework provides a systematic statistical approach to understand the behavior of extreme weather impacts on the U.S. power grid based on real recorded outages across the nation. This also provides researchers with PDFs governing the behavior of extreme outage events for resilience-based studies.
Though the efficiency of the proposed framework has been validated, a few remarks need to be highlighted. First, the proposed method ignores the temporal changes in the number of counties, the county population levels, and the number of customer subscribers for each utility; instead, it provides a single snapshot based on existing static values of these parameters. Also, the EAGLE-I data do not cover all counties, resulting in biased results for states that are missing accurate county information. Note that the proposed methodology overweights the evaluation metrics because of the significant amount of non-reported data at the utility level and because it ignores the frequency of occurrence of each event.
The frequency and duration of extended power outages are critical inputs in resilience planning and underpin any probabilistic approach to valuing the benefits of resilience investments. Therefore, this work contributes to quantitative resilience analysis; however, there is still significant additional work to be done. Future research can improve the accuracy of the estimates and determine how outage durations vary by underlying causes. Moreover, while this analysis presents state-level data, this work does not present a state-level intercomparison. Although one state might have a higher average outage duration (or other metric) than another, each state also has a different risk profile of extreme events. Comparing the outage profile of California wildfires and Florida hurricanes, for example, would be inappropriate; this work does not propose that any state is better or worse at resilience based on the analyzed metrics. Also, future power outages might have different characteristics than historical power outages, especially because of climate change. Additional work could model changes in underlying hazards to determine how outage frequencies and durations will vary over time. Further, a resilience-based outage curve can be developed to represent the temporal and magnitude behaviors of extreme outages based on specific weather-related causes. Extended power outage events are inevitable, but understanding their causes, probabilities, and impacts is an important step toward reducing their likelihood and consequences.

ACKNOWLEDGMENT
Thanks to the EAGLE-I™ Team at the Oak Ridge National Laboratory for providing access to the power outage data and for the insightful comments. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
She is currently a Technical Project Lead with the Resources and Sustainability Group, Strategic Energy Analysis Center, National Renewable Energy Laboratory, Golden, CO, USA, leading disaster recovery, resilience, and sustainable development efforts. Her research interests include analysis and outreach to increase deployment of sustainable clean energy technologies, and best practices at all levels of government for sustainable and resilient infrastructure systems.
MOHAMMED BENIDRIS (Senior Member, IEEE) received the B.Sc. and M.Sc. degrees in electrical engineering from the University of Benghazi, Libya, and the Ph.D. degree in electrical engineering from Michigan State University.
He is currently an Associate Professor of electrical engineering with the University of Nevada, Reno (UNR). Prior to joining UNR, he worked as a Research Associate and a Visiting Lecturer at Michigan State University, an Assistant Lecturer at the University of Benghazi, and an Engineer at General Electric Company of Libya. He has more than five years of industry and consulting experience ranging from power plant control and operation to hardware design and installation, and a total of more than ten years of academic experience. His research interests include power system modeling, analysis, reliability, stability, and resilience. VOLUME 11, 2023