A Study on the Association of Socioeconomic and Physical Cofactors Contributing to Power Restoration After Hurricane Maria

The electric power infrastructure in Puerto Rico suffered substantial damage as Hurricane Maria crossed the island on September 20, 2017. Despite significant efforts by authorities, it took almost a year to achieve near-complete power recovery. In this study, we used spaceborne daily nighttime lights (NTL) imagery as a surrogate measure of power loss and restoration. We quantified the spatial and temporal extent of the loss of electric power and the trends in gradual recovery for the 889 county subdivisions for over eight months, and computed days without service for each of these tabulation areas. We formulated a Quasi-Poisson regression model to identify the association of features from the physical and socioeconomic domains with the power recovery effort. According to the model, recovery took longer in areas closer to the landfall location of the hurricane, with every 50-kilometer increase in distance from the landfall corresponding to 30% fewer days without power (95% CI = 26% – 33%). Road connectivity was a major issue for the restoration effort: areas with a direct connection to high-speed roads recovered more quickly, with 7% fewer outage days (95% CI = 1% – 13%). Areas affected by moderate landslides needed 5.5% (95% CI = 1% – 10%) more days to recover, and areas affected by high-intensity landslides needed 11.4% (95% CI = 2% – 21%) more days. Financially disadvantaged areas suffered more from the extended outage: for every 10% increase in population below the poverty line, there was a 2% increase in recovery time (95% CI = 0.3% – 2.8%).


I. INTRODUCTION
On September 20th, 2017, Hurricane Maria passed through the island of Puerto Rico with winds exceeding 155 miles/hour [1]. This superstorm was the tenth-most intense Atlantic hurricane on record [2] and devastated all the major parts of the island, resulting in serious damage to the electric and communication grids, food, water, and transportation infrastructure, hospitals, and schools. More than 400,000 housing units were nearly destroyed, and of a total population of 3.2 million, 1.7 million lacked potable water, and 90% of the island was left without mobile or internet service [3]. The hurricane resulted in approximately 3000 deaths directly attributed to the immediate impact of the storm [4]. It was estimated that $95 billion would be needed to fully recover from these damages [5], [6]. After Hurricane Maria, the US Congress approved $20 billion in funding for the long-term rebuilding of Puerto Rico; however, much of this funding remained undisbursed until 2020. In 2021, some restrictions on the release of funds were lifted [7].
The associate editor coordinating the review of this manuscript and approving it for publication was Weimin Huang.
The hurricane compounded the economic debt crisis the island has been experiencing over the past decade. As a result, infrastructure owners and operators faced financial challenges and were often unable to upgrade the aging infrastructure. Persistent growth in unemployment has driven large numbers of young and working-age people to leave the island, leaving an increasing proportion of elderly residents and relatively higher rates of poverty among those remaining [6].
The electrical power failure was one of the major factors responsible for the loss of life and the slow pace of disaster recovery [8]. The power system in Puerto Rico operates within a challenging physical terrain and demands resilient infrastructure to withstand extreme storms. Yet, for a variety of reasons, the power infrastructure was unable to endure the extreme weather experienced during Maria [9]. Traditional fossil-fueled power generation infrastructure is bound together by complex and interdependent elements [10], and disruption of one component in this complex network often leads to outages, making the entire system unsustainable. Most of the electric infrastructure on the main island of Puerto Rico was fueled by coal, gas, and petroleum plants, with fuels imported by sea. Only 3 to 5% of total energy demand was met by locally generated renewable sources. The largest fossil fuel-based power generating plants of the Puerto Rico Electric Power Authority (PREPA) are in the southern part of the island, whereas the population concentration is highest in the north. This makes the power infrastructure highly dependent on 30,000 miles of distribution lines and 2,400 miles of transmission lines [11], [12]. Hurricane Maria caused extensive damage to these power lines. The distribution system suffered enormously, with 80% of circuits damaged. Most of the distribution poles collapsed, as they were outdated and not designed to handle strong hurricanes like Maria [9]. About 50,000 utility poles were damaged by the hurricane and needed to be replaced [13]. As a consequence, the archipelago experienced the longest and largest blackout in U.S. history and the second largest in the world on record [14]. Even during the recovery process, Puerto Rico experienced severe outages from time to time, due to power station blasts [15], failures of substations [16], and main power line disruptions caused by falling trees [17] or bulldozers [18].
In certain areas, it took eleven months to restore power to communities [19]. PREPA did not have adequate resources to handle this massive destruction of power infrastructure [20], [21].
Fossil fuel-based power generation not only made the archipelago rely on a complex electrical network that is more vulnerable to disasters but also made the cost of electricity twice that of the US mainland [22]. The cost of electricity also fluctuates with international petroleum prices [12]. The authorities have recognized the weaknesses of the power infrastructure exposed by the hurricane and are seeking alternatives. In 2019, PREPA issued a revised integrated resource plan [23], prioritizing the establishment of mini- and micro-grids, maximizing locally generated renewable power, and increasing battery storage capacity [24]. In 2019, Puerto Rico also passed a law, the Puerto Rico Energy Public Policy Act, to eliminate coal-based power generation by 2028 and reach 100% renewable power generation by 2050 [12].
Numerous studies have introduced methods for the prediction of power outages in advance of hurricane landfall. Liu et al. [25] leveraged the negative binomial model, a generalized linear model (GLM), and incorporated transformer and wind speed data to predict the level of outage before hurricane landfall. This was one of the first published statistical models to estimate hurricane-induced power outages. Han et al. [26] improved the GLM by adding variables including local climatology, geography, and power systems. A recent study also included socioeconomic and demographic variables to predict outages with a linear regression model [27]. Some studies also leveraged machine learning algorithms, including random forest (RF), classification and regression trees (CART), Bayesian additive regression trees (BART), and multivariate adaptive regression splines (MARS) [28]-[33]. These studies focused on modeling the power outage, which is equivalent to modeling the damage to power infrastructure. On the other hand, a few studies aimed to model the outage duration, which is analogous to modeling the power infrastructure recovery after extreme events. Liu et al. [34] adopted both accelerated failure time (AFT) and Cox proportional hazard (CPH) regression techniques to predict outage durations for ice storms and hurricanes. Nateghi et al. [35] showed that the RF model can predict the outage duration with higher accuracy. They used physical factors including wind speed, wind duration, soil moisture, number of poles, switches, and transformers, length of power lines, elevation, and land cover as the input variables in the model. Some studies argue that socioeconomic factors are also important determinants of outage duration [36]-[38].
A study of power restoration in Florida after Hurricane Irma found that minority groups, the population with disabilities, and the unemployment rate have statistically significant positive associations with power outage vulnerability [39].
In the case of Hurricane Maria, some reports, without considering confounding factors from the physical domain, claim that socioeconomic inequalities drove the power restoration effort on the island after the hurricane, with affluent communities recovering early and the poor suffering lengthy outages [40]-[42]. However, the studies discussed above indicate that the duration of a hurricane-induced power outage is significantly associated with the physical infrastructure of the affected area and with the characteristics of the hurricane. To the best of our knowledge, no study has assessed the effects of both physical and socioeconomic cofactors on the power restoration effort for Hurricane Maria. In this study, using a multivariate statistical modeling approach, we investigated whether the effects of socioeconomic features (income and race) were more dominant in power recovery than the covariates from the physical domain, such as power lines, roads, elevation, and so on. We draw on a dataset of remotely sensed nighttime light (NTL) imagery to better understand the spatial and temporal patterns of the recovery process in Puerto Rico after Hurricane Maria. Using NTL, we constructed time-series recovery graphs for all the county subdivisions, or barrios, of the island and computed the average days without power at the barrio level. We observed that the southeastern part of the island was highly affected by the hurricane. Lastly, we formulated a Quasi-Poisson regression model to identify the socioeconomic and physical cofactors that influenced the post-hurricane power restoration effort.
VOLUME 9, 2021

II. NIGHTLIGHTS FROM SPACE
The era of nighttime light imagery began with the Defense Meteorological Satellite Program (DMSP), which dates to the mid-1960s [43], [44]. However, the digital archive of DMSP/OLS (Operational Linescan System) nighttime light data started in 1992 [45]. DMSP was followed in 2011 by the Suomi NPP satellite with the Visible/Infrared Imager and Radiometer Suite (VIIRS) sensor, which overcame the shortcomings of DMSP [46]-[48]. One of the VIIRS sensor's bands is the day/night band (DNB), which detects light in a range of wavelengths from green to near-infrared. Using this band, VIIRS can deliver high-quality information on nighttime lights even in low-light conditions [49]. There is a rich body of knowledge on the application of nighttime lights (NTL) imagery to the quantitative analysis and modeling of various phenomena on the earth. These include modeling of economic growth [50]-[57], studies on population change dynamics [58]-[62], power disruption analysis [47], [63]-[65], urban expansion estimation [66]-[70], and electricity consumption modeling [71], [72]. VIIRS nightlight data has also been used to estimate the power loss and recovery in Puerto Rico during Hurricane Maria. In a previous study, the island-wide recovery rate estimated from nighttime lights was close to the official daily recovery reported by the power authority of Puerto Rico, with an average daily difference of 17.9% over the first six months of the recovery [41].
In this study, we used the most recent bidirectional reflectance distribution function (BRDF)-corrected VIIRS daily nighttime light product (VNP46A2) [73] to model the post-Hurricane Maria power infrastructure recovery in Puerto Rico. Nighttime light radiance before and after Hurricane Maria is shown in Fig. 1(a, b). Fig. 1(c) shows the percent loss in nighttime light radiance at the 500-meter-resolution pixel level. It is clear from the map that the eastern part of the island was highly affected by the hurricane. Fig. 2 shows the histogram of power loss for each month as horizontal bars. Each bar is divided into 40 bins, where each bin represents a 2.5% range of power loss. The color code represents the area of the island that fell into each bin in each month. After the landfall of Maria on September 20, 2017, a significant portion of the island lost between 70 and 90 percent (island-wide median = 78.4%) of its power infrastructure capacity. The figure also shows the power recovery of the island over time. The red-concentrated areas in the bars, which represent large land areas, shift towards the left each month, reflecting the overall recovery of the island. In March 2018, according to nightlight pixel values, 76% of the island had fully recovered, with an island-wide power loss of 5%.

III. QUANTIFYING RESILIENCE AND DAYS WITHOUT POWER
The concept of resilience has been formulated in various disciplines, such as earthquake engineering research, where significant work has been done on defining resilience metrics. Accordingly, a given system's resilience is defined as a function of how well it can withstand a shock and how quickly it recovers if it experiences any level of failure [74], [75]. Typically, performance gradually recovers until it reaches a final stable state, usually at or near the original performance level. Fig. 3 illustrates this concept using the daily nighttime light, expressed as a percentage, after Hurricane Maria for the San Anton barrio of Ponce, where the green line represents the resilience graph. The area above the resilience curve represents the loss of performance of the power system over time, whereas the area under the curve indicates the available capacity on a given day. We developed an interactive web platform to visualize the resilience graphs for all the barrios across the island, available at https://shams08.github.io/NightlightMaria/.
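As a hedged illustration of the resilience-curve idea, the following Python sketch integrates a daily percent-of-baseline series with the trapezoidal rule to obtain the areas above and under the curve. The function name `resilience_areas`, the clipping to the 0 to 100% range, and the toy series are assumptions made for illustration; they are not taken from the study's own implementation.

```python
def resilience_areas(percent_of_baseline):
    """Return (area_above, area_under) a resilience curve, in percent-days.

    percent_of_baseline: one value per day, where 100 = pre-storm capacity.
    Values are clipped to [0, 100] so that over-recovery does not produce
    negative loss (a simplifying assumption of this sketch).
    """
    clipped = [min(100.0, max(0.0, v)) for v in percent_of_baseline]
    area_under = 0.0
    for left, right in zip(clipped, clipped[1:]):
        # Trapezoid between two consecutive days (width = 1 day).
        area_under += (left + right) / 2.0
    total = 100.0 * (len(clipped) - 1)
    return total - area_under, area_under


# Toy curve: capacity drops to 20% and recovers linearly over four days.
curve = [20.0, 40.0, 60.0, 80.0, 100.0]
lost, available = resilience_areas(curve)
print(lost, available)
```

Dividing the area above the curve by 100 expresses the loss as equivalent days of complete outage, which is one possible way, assumed here, to condense a curve into a single number.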
To capture the power loss and recovery for each of the 889 barrios, we spatially aggregated the 500-meter-resolution daily nighttime light data at the barrio level. To construct the resilience graph, a stable pre-hurricane baseline radiance value needed to be identified for each barrio. We took the daily average nighttime light radiance for the month before Hurricane Maria (August 2017) as the baseline and considered it to be 100% power capacity for the given barrio. To capture the drop and recovery in power capacity, we used daily nighttime data from September 20, 2017 to May 20, 2018. However, due to cloud cover, daily data is not always available for each barrio. To estimate the values on missing dates and to remove white noise from the daily time-series observations, we fitted a 10th-degree polynomial to the observed values to capture the overall trend of the recovery, and then used linear interpolation of the trendline to estimate the missing observations. Following the methods of other studies [41], we computed the average days without power for each barrio.
Fig. 4(a) shows the average days without power for each barrio along with the hurricane path and landfall location. It is clear from the map that the southeastern side of the island took more time to recover than the western side. Fig. 4(b) and Fig. 4(d) show the spatial distribution and histogram of the median household income at the barrio level. The median annual household income of the island is just over 20,000 USD, which is less than a third of the median income in the US (68,703 USD). The map shows that the western part of the island is more poverty-stricken than the eastern side. In the following sections, we use a statistical model to investigate the role of poverty in the power restoration effort.
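The baseline normalization and gap-filling steps described in this section can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: it interpolates linearly between the raw observations, whereas the study first fits a polynomial trend and interpolates along that trendline, a step deliberately left out here. The function names and the toy radiance values are hypothetical.

```python
def percent_of_baseline(radiance, baseline):
    """Express daily radiance as a percentage of the pre-storm baseline.

    Missing (cloud-covered) days are represented by None and stay None.
    """
    return [None if r is None else 100.0 * r / baseline for r in radiance]


def fill_gaps_linear(values):
    """Linearly interpolate None entries between observed neighbours.

    Leading or trailing gaps are filled with the nearest observed value
    (an assumption of this sketch).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observations")
    filled = list(values)
    for i in range(len(values)):
        if filled[i] is not None:
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if not before:
            filled[i] = values[after[0]]
        elif not after:
            filled[i] = values[before[-1]]
        else:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            filled[i] = values[lo] + weight * (values[hi] - values[lo])
    return filled


# Hypothetical barrio: August radiance defines the 100% baseline.
august = [10.0, 12.0, 11.0, 11.0]
baseline = sum(august) / len(august)

# Post-storm daily radiance with cloud gaps (None).
daily = [2.2, None, None, 5.5, None, 8.8]
series = fill_gaps_linear(percent_of_baseline(daily, baseline))
print(series)
```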

IV. COFACTORS FOR POWER RECOVERY
To identify the association of the socioeconomic and physical factors in power recovery, we looked at population density, wind speed, the transportation infrastructure, the location and intensity of landslides, power infrastructure, and the location of business institutions.
Population density, the percentage of the white population, and the percentage of the population living under poverty at the barrio level were incorporated using data from the American Community Survey [76]. To capture the magnitude of the hurricane's destruction in the model, we included two variables. First, we computed the percent power loss right after the hurricane using nighttime lights:
$$\text{Percent power loss} = \frac{b - a}{b} \times 100 \qquad (2)$$
Here, b is the mean monthly NTL radiance in August 2017, and a is the NTL radiance on the first available observation after the hurricane (between September 21st and 27th). Secondly, we computed the linear distance from the centroid of each barrio to the hurricane landfall location on the island and used it as a surrogate measure for wind speed. Generally, the land where a hurricane hits first receives its maximum wind strength; the hurricane then weakens as it moves across the land [77]. We located the landfall point by observing the hurricane path shown in Fig. 4(a).
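The two damage-related variables can be sketched in Python as below. The percent-loss function follows the definitions of b and a given above. For the distance, the sketch swaps in the haversine great-circle formula as a stand-in for the linear distance, which is an assumption; a planar distance in a projected coordinate system would serve equally well at this scale. The coordinates in the example are hypothetical placeholders, not the landfall point or a barrio centroid used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def percent_power_loss(baseline, after):
    """Percent NTL loss relative to the pre-storm baseline radiance."""
    return 100.0 * (baseline - after) / baseline


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Hypothetical values for illustration only.
loss = percent_power_loss(baseline=10.0, after=2.5)
distance = great_circle_km(18.0, -65.9, 18.4, -66.1)
print(loss, distance)
```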
Road infrastructure forms a complex network, as every road segment can have distinct attributes such as speed limit, number of lanes, and traffic signals. According to the Federal Highway Administration (FHWA), roads can be classified into four major types: interstates, arterials, collectors, and local roads [78]. Local roads correlate strongly with population density, whereas arterial roads include freeways and highways, and collector roads connect local roads with the arterial roads [79]; arterial and collector roads therefore represent the external connectivity of the barrios. In regression models, it is often not possible to incorporate all the attributes of the road network. For this reason, many studies use road density as a surrogate for accessibility [80], [81]. Road density is also considered an important factor in disaster recovery studies [82], [83]. To include road accessibility in the power restoration model, we considered two variables. First, we computed the road density for each barrio, which we treated as a measure of local accessibility. We collected the road data from the Humanitarian OpenStreetMap Team [84]. Secondly, we created a dichotomous variable to capture the accessibility of the barrios via high-speed roads. In the model, this variable examines how inter-barrio or regional accessibility affected the power restoration. We found that 121 out of 889 barrios do not intersect arterial and collector roads (Fig. 5). Here, we used the arterial and collector road shapefile from the Highway Performance Monitoring System (HPMS) dataset of the FHWA [79]. The dummy variable can be expressed with the following notation.
$$\text{Road dummy} = \begin{cases} 0, & \text{barrio not having arterial and collector roads} \\ 1, & \text{barrio having arterial and collector roads} \end{cases}$$
To include elevation in the model, we used Shuttle Radar Topography Mission (SRTM) 90-meter-resolution digital elevation data [85], spatially aggregated it at the barrio level to compute the average elevation for each barrio, and used it as a cofactor in the regression model. Reports show that Hurricane Maria triggered landslides in many parts of the island [86]. To find the effect of landslides on the power restoration effort, we collected landslide intensity data from the USGS [87]. The landslide data divides the island into 4 square km grid cells, each classified as containing no landslides, fewer than 25 landslides/square km, or more than 25 landslides/square km. We aggregated the landslide data spatially at the barrio level and then created a categorical variable for the barrios with three classes: none, moderate landslide, and high landslide.
Similar to the road network, the electrical power system is a complex network built on interconnection and interdependence among different components, including generators, power lines, transformers, substations, and others [88]. To find the association of power systems with hurricane-induced outages, many studies have used power line density as a cofactor in their power outage models [26], [27], [29], [35]. Here, we collected the power transmission line, distribution line, and substation shapefiles from the GIS portal of the Government of Puerto Rico website [89]. We spatially aggregated the components of the power system to compute their density at the barrio level. Finally, we collected the locations of hotels in Puerto Rico from the OpenStreetMap (OSM) database [90] and created a dummy variable reflecting whether a barrio has hotels or not, to identify whether any business institutions were prioritized in the restoration effort.

V. MODELING APPROACH
Poisson regression is a popular choice for modeling count data. However, it comes with the strict assumption that the expected value (mean) is equal to the variance, which cannot be satisfied in many real-world scenarios. Over-dispersion in count data occurs when the conditional variance exceeds the conditional mean [91]. With our data, we found that the residual deviance of the Poisson model was much higher than the residual degrees of freedom, which indicates over-dispersion in the data and calls for an alternative modeling approach. In this study, we used the Quasi-Poisson regression model, a generalization of Poisson regression that is effective for modeling over-dispersed count data [92]-[95]. The Quasi-Poisson model assumes that the variance is a linear function of the mean, which can be written as:
$$\mathrm{Var}(Y) = \theta \, E(Y) \qquad (3)$$
where Var(Y) is the variance of Y, E(Y) is the expected value of Y, and θ is the dispersion parameter. The algebraic form of the model equation for Quasi-Poisson regression is the same as that for Poisson regression, where the log of the expected outcome is modeled as a linear combination of the cofactors:
$$\log(E(Y)) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n \qquad (4)$$
where Y is the target variable (days without power), $X_1, \ldots, X_n$ are the explanatory variables, $\beta_0$ is the intercept, and $\beta_1, \ldots, \beta_n$ are the regression coefficients to be estimated. We implemented the model in R using the glm function [96].
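To make the relationship between the Poisson and Quasi-Poisson fits concrete, the following self-contained Python sketch fits a one-covariate log-linear model by iteratively reweighted least squares (IRLS), estimates the dispersion θ from the Pearson statistic, and inflates the standard error of the slope by the square root of θ. It is a teaching sketch under stated assumptions, not the R implementation used in the study: the function names, the single covariate, and the toy data are all hypothetical.

```python
import math


def _weighted_sums(x, mu):
    """Sums needed for the 2x2 weighted normal equations (weights = mu)."""
    sw = sum(mu)
    swx = sum(m * xi for m, xi in zip(mu, x))
    swxx = sum(m * xi * xi for m, xi in zip(mu, x))
    return sw, swx, swxx


def fit_quasi_poisson(x, y, iterations=50):
    """Fit log(E[Y]) = b0 + b1 * x under Var(Y) = theta * E(Y).

    Returns (b0, b1, theta, se_b1). The coefficients are the ordinary
    Poisson estimates; only the standard error is scaled by sqrt(theta).
    """
    n = len(x)
    b0, b1 = math.log(sum(y) / n), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Working response for the log link.
        z = [b0 + b1 * xi + (yi - mi) / mi for xi, yi, mi in zip(x, y, mu)]
        sw, swx, swxx = _weighted_sums(x, mu)
        swz = sum(mi * zi for mi, zi in zip(mu, z))
        swxz = sum(mi * xi * zi for mi, xi, zi in zip(mu, x, z))
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    sw, swx, swxx = _weighted_sums(x, mu)
    det = sw * swxx - swx * swx
    # Pearson chi-square divided by residual degrees of freedom (n - 2).
    theta = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu)) / (n - 2)
    se_b1 = math.sqrt(theta * sw / det)
    return b0, b1, theta, se_b1


# Toy data (hypothetical): outage days versus a distance-like covariate.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [150, 70, 95, 40, 60, 20, 35, 10]
b0, b1, theta, se_b1 = fit_quasi_poisson(x, y)
print(b0, b1, theta, se_b1)
```

A value of θ well above 1 signals over-dispersion; a plain Poisson fit would report the same coefficients but understate their standard errors.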
To understand the relationships among the cofactors, we performed a bivariate correlation analysis using the Pearson correlation coefficient, which is given by the following equation.
$$r_{x,y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y} \qquad (5)$$
where $r_{x,y}$ is the Pearson coefficient, Cov(X, Y) is the covariance of X and Y, and $\sigma_x$ and $\sigma_y$ are the standard deviations of X and Y, respectively. The value of $r_{x,y}$ can range from −1 to +1. A relationship is considered 'strong' if the absolute value of the coefficient is over 0.70, 'moderate' if it falls between 0.40 and 0.69, and 'weak' if it is between 0.1 and 0.39 [97], [98].
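Equation (5) and the strength labels can be illustrated with a short Python sketch. The handling of values that fall exactly on or between the stated cut-offs, and the "negligible" label for coefficients below 0.1, are assumptions of this sketch rather than part of the cited classification.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of std devs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors in the covariance and variances cancel out.
    return cov / math.sqrt(var_x * var_y)


def strength(r):
    """Label |r| using the thresholds quoted in the text."""
    magnitude = abs(r)
    if magnitude >= 0.70:
        return "strong"
    if magnitude >= 0.40:
        return "moderate"
    if magnitude >= 0.10:
        return "weak"
    return "negligible"  # label assumed for |r| < 0.1


print(strength(pearson_r([1, 2, 3, 4], [2, 4, 6, 8])))
```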

VI. RESULTS AND DISCUSSION
As noted above, we used the Quasi-Poisson model to identify the influential factors in the power recovery in Puerto Rico after Hurricane Maria. Model parameters are shown in Table 1. Here, we report the Outage Rate Ratio (ORR), which is the exponentiated regression coefficient estimate, exp(β_i), from the Quasi-Poisson model.
The sign and the magnitude of the regression coefficient reflect the degree of association of each variable with the recovery. A positive regression coefficient (Coef) indicates that the variable is associated with a slower recovery process; that is, the days without power increase as the value of the cofactor increases. The p-value and Z-score for every coefficient are also included in Table 1 to show the statistical significance of the covariates. To be considered a statistically significant association, the p-value needs to be less than 0.05, and the absolute value of the Z-score should be at least 1.96. We also report the 95% confidence interval of the ORR in Table 1. We found that the residual deviance of the model is significantly lower than the null deviance, meaning the covariates substantially improve the model performance. The residual deviance is nevertheless larger than the residual degrees of freedom, indicating over-dispersion in the data, which is accounted for by the dispersion parameter θ. To avoid the effect of multicollinearity in the model, we only included variables that have low correlation coefficient values. Accordingly, variables such as population density and road density are not included as cofactors in the model, since they correlate strongly with power distribution line density. However, we explain the effects of these variables in the model interpretation using their correlation coefficient values. The correlation matrix of the cofactors and the residual plots of the model are included in the appendix.
Hurricane Maria made landfall in the southeastern part of the island, causing the maximum damage to the power infrastructure there (Fig. 1(b)). That part of the island also experienced long outages (Fig. 4(a)). According to the model, and as expected, the distance from the landfall was a significant factor for recovery because the land where the hurricane hits first is most likely to experience severe damage. The areas around the landfall therefore suffered maximum damage across all physical infrastructures, such as power, communication, and others. The roads closer to the landfall had a higher propensity to be destroyed, making it difficult for the service crews to reach the affected areas and resulting in a long recovery time. According to the model parameters, the barrios located further away from the landfall experienced faster recovery. Holding all other variables constant, every 50-kilometer increase in distance from the landfall corresponded to 30% (95% CI = 26% – 33%) fewer days without power. On the other hand, the covariate '%power loss' has a positive association with the outage, indicating that areas that lost a larger percentage of power, or had more relative power infrastructure damage, needed more time to recover. Every 10% increase in percent power loss corresponds to a 7.2% (95% CI = 5.4% – 9.7%) increase in days without power.
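Because the model is log-linear, effects of this kind are multiplicative rather than additive. The short Python sketch below converts between a regression coefficient and a percent change in expected outage days for a chosen covariate increment. It assumes the covariate enters the model linearly (not log-transformed); the function names are hypothetical, and the per-kilometer coefficient is back-calculated from the reported 30% figure rather than taken from Table 1.

```python
import math


def percent_change(beta, increment=1.0):
    """Percent change in expected outage days for a covariate increment."""
    return (math.exp(beta * increment) - 1.0) * 100.0


def beta_from_percent_change(pct, increment=1.0):
    """Coefficient per covariate unit implied by a percent change."""
    return math.log(1.0 + pct / 100.0) / increment


# 30% fewer days per 50 km implies this per-kilometer coefficient.
beta_km = beta_from_percent_change(-30.0, increment=50.0)
print(beta_km)

# The same coefficient applied to a 100 km increment.
print(percent_change(beta_km, increment=100.0))
```

Under these assumptions, a 100-kilometer increase corresponds to a rate ratio of 0.7 × 0.7 = 0.49, that is, about 51% fewer days without power rather than 60%.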
According to the model parameters, the density of the power distribution lines was one of the major factors for the recovery. Following other studies [26], [27], [29], we used distribution line density to investigate how the power infrastructure of the island influenced the restoration process. With bivariate correlation analysis, we found that power distribution line density is strongly correlated with population density (r = 0.90) and moderately correlated with substation density (r = 0.55) and 38 kV transmission line density (r = 0.47). Fig. 6 shows the power lines and substations concentrated in the population centers. According to the model, holding other variables constant, a 50% increase in distribution line density decreases the days of outage by 1.07% (95% CI = 0.1% – 2.0%). This indicates that the recovery was relatively fast in places with a high density of electrical equipment as well as population. In other words, this suggests that the power authority prioritized high-density areas for early recovery. We also found that, at the barrio level, electric power line density strongly correlates with road density (r = 0.81). Since outage duration is negatively associated with distribution line density, we can also say that barrios with higher road density recovered earlier. High road density offers more options for the power crews to reach the affected area even if the primary route to the destination is closed by debris or a landslide.
According to the model, the variability of road connectivity and elevation were significant determinants for recovery.
To capture the regional connectivity of barrios, we incorporated the arterial and collector roads into the road dummy variable. The negative coefficient indicates that areas with a direct connection to arterial or collector roads have a higher propensity for early recovery, with a reduction in outage days of 7% (95% CI = 1% – 13%). This suggests that the service crews went first to the barrios that were easily accessible through the high-speed road network. Conversely, land elevation is positively associated with the number of days without power. The coastal regions of Puerto Rico have low elevation, whereas the central part of the island is composed of mountainous and hilly terrain. The model shows that, holding all other variables constant, a 50 percent increase in elevation increases the outage days by 1.2% (95% CI = 0.34% – 2.0%). In addition, many parts of the island experienced major landslides in the aftermath of Hurricane Maria. The recovery effort was greatly hampered by the landslides, as they damaged roads and obstructed them with debris (Fig. 1(d)). The recovery period was extended by 5.5% (95% CI = 1% – 10%) for the barrios affected by moderate landslides (fewer than 25 landslides/square km) and by 11.4% (95% CI = 2% – 21%) for the barrios that suffered from high-intensity landslides (more than 25 landslides/square km). Finally, we observed from the model that poverty has a statistically significant positive association with the days without power. Holding other variables constant, every 10% increase in people living under poverty corresponded to a 2% (95% CI = 0.3% – 2.8%) increase in outage days. The model did not find a statistically significant racial influence in the power recovery efforts.
Although socioeconomic factors like poverty were statistically significant, we cannot claim that they were the only major driving factor for the recovery. Physical factors like communication, elevation, the magnitude of damage, and others played a non-trivial role in determining the number of days communities went without power after Hurricane Maria.

VII. CONCLUSION
Hurricane Maria, a Category 4 hurricane, devastated the power infrastructure of the island of Puerto Rico. We used remotely sensed nighttime light satellite imagery to quantify the power loss and recovery of the island. First, we spatially aggregated the pixels of daily NTL imagery at the county subdivision (barrio) level, constructed the time-series nighttime light recovery graph for each barrio, and computed the average days without power. We found that the hurricane caused a reduction of about 78.4% in lights across the entire island, with the southeastern part experiencing a longer outage. We formulated a Quasi-Poisson regression model to identify the association of features from the socioeconomic and physical domains with the power restoration effort of the island. The model suggests that the places closer to the hurricane landfall location took a long time to recover. The hurricane hit the island with its maximum strength at the point of landfall, which corresponds to massive destruction across all physical infrastructures, resulting in a longer recovery period. We also found that the recovery took more time for places that experienced more damage from the hurricane. The model shows that the accessibility of an area was a significant factor for the recovery. Barrios that are well connected with arterial and collector roads recovered early, whereas the mountainous regions experienced more days without power. Many mountainous places were affected by landslides after the hurricane, which blocked or damaged the roads, making it difficult for the service crews to reach the affected areas. The model also showed that poverty-stricken neighborhoods suffered from extensive outages. However, the magnitude of the influence of poverty on power restoration cannot overshadow the association of other physical factors with the recovery process. We did not find a statistically significant association of race or ethnicity with the restoration effort.

APPENDIX
A linear model suffers from multicollinearity when two or more cofactors in a multivariate regression model are strongly correlated. Multicollinearity causes a large variance of the estimators and produces unreliable model parameters.
To reduce multicollinearity in the model, we excluded highly correlated regressors. Pearson's correlation coefficients of the covariates are shown in Table 2. Fig. 7 shows the residual plot of the Quasi-Poisson model. The mean of the residuals is zero, with no observable trend, which indicates randomness. The Q-Q plot (Fig. 7) also shows that the residuals of the model follow a normal distribution. Both of these plots indicate that the model effectively captured the variability in the data.