Inference on Long-Range Temporal Correlations in Human EEG Data

Detrended Fluctuation Analysis (DFA) is a statistical estimation algorithm used to assess long-range temporal dependence in neural time series. The algorithm produces a single number, the DFA exponent, that reflects the strength of long-range temporal correlations in the data. No methods have been developed to generate confidence intervals for the DFA exponent for a single time series segment. Thus, we present a statistical measure of uncertainty for the DFA exponent in electroencephalographic (EEG) data via application of a moving-block bootstrap (MBB). We tested the effect of three data characteristics on the DFA exponent: (1) time series length, (2) the presence of artifacts, and (3) the presence of discontinuities. We found that signal lengths of ∼5 minutes produced stable measurements of the DFA exponent and that the presence of artifacts positively biased DFA exponent distributions. In comparison, the impact of discontinuities was small, even those associated with artifact removal. We show that it is possible to combine a moving block bootstrap with DFA to obtain an accurate estimate of the DFA exponent as well as its associated confidence intervals in both simulated data and human EEG data. We applied the proposed method to human EEG data to (1) calculate a time-varying estimate of long-range temporal dependence during a sleep-wake cycle of a healthy infant and (2) compare pre- and post-treatment EEG data within individual subjects with pediatric epilepsy. Our proposed method enables dynamic tracking of the DFA exponent across the entire recording period and permits within-subject comparisons, expanding the utility of the DFA algorithm by providing a measure of certainty and formal tests of statistical significance for the estimation of long-range temporal dependence in neural data.

T HE strength of long-range temporal correlations in time series can be estimated by Detrended Fluctuation Analysis (DFA), a statistical method based on scaled windowed variance [1]. Although the technique originated to assess patterns in DNA structure [2], the utility of DFA has expanded to probe long-range temporal structure in other physical systems such as daily temperature fluctuations [3], gait stride intervals [4], and heart rate variability [5]. Since the discovery of long-range temporal correlations in recordings from the human brain [6], neurophysiologists and clinicians have used DFA to analyze the long-range temporal structure in neural data associated with sensorimotor function [7], [8], cognitive development [9]- [11], neurologic disease [12]- [14], brain stimulation [15], [16], and brain trauma [17]- [19].
DFA is one of several algorithms that statistically estimates the Hurst exponent (H), a scalar value that reflects the level of self-similarity in time series [20]- [23]. The measure directly relates to the rate of decay of the autocorrelation function (Supplementary Figure 1) [24], [25]. When DFA is applied to a segment of time series data, it returns an estimate of the Hurst exponent, but it does not provide a statistical inference for that estimated Hurst exponent. Thus, statistical comparisons are done on distributions of DFA exponent values from groups of subjects. Without a quantification of the uncertainty in the estimate of H, within-subject comparisons (across conditions) and the assessment of statistically significant changes in the DFA exponent over time are not possible. In this study, we present a statistically rigorous method to estimate the confidence interval for the DFA exponent of human EEG data. This technique enables us to track long-term EEG temporal structure changes in an infant and to compare treatment responses in subjects diagnosed with epilepsy.
Our method of statistical inference for DFA applied to neural data is based on the moving block bootstrap (MBB) [26], [27]. Bootstrapping and other Monte Carlo methods generate independently sampled data with similar statistics as the original signal. The MBB is a bootstrapping-based procedure that is applicable to signals with temporal autocorrelation (such as neural time series data) [28]. Using MBB, we generate distributions of estimates of the Hurst exponent by randomly sampling from blocks of a time series (rather than singleton points, as in a classical bootstrapping technique), concatenating the blocks, and computing the test statistic over the new time series [27]. This preserves the dependence structure of the time series [26]. Although MBB has been used to generate empirical distributions of DFA exponents in simulated and financial returns time series [29], the challenge of bootstrapping EEG data for analysis with DFA has not been addressed. The extension to brain signals, in particular EEG, is not trivial because of the nonstationary and nonlinear nature of the signals, the continuous transition of brain states, and the presence of artifacts in the data [30]- [33]. We discuss practical considerations for implementation such as the minimum signal length required and the effect of introducing discontinuities through bootstrapping and artifact rejection. We then show how these methods can be applied to EEG data to assess changes in brain state and measure response to treatment in subjects diagnosed with epilepsy.

A. Simulated Data Generation
We used simulated data to show performance of the DFA algorithm under various conditions that are common in neural time series processing and MBB. Simulated fractional Brownian motion (fBm) was created by integrating fractional Gaussian noise (fGn) that was generated with a given Hurst exponent (H) [34]. We used the contributed MATLAB function "ffgn" [35] to create a temporally-correlated signal with H ranging from 0.50 to 0.99. The function generated exact paths of fractional Gaussian noise by means of circulant embedding for positively correlated signals (0.5 < H < 1.0) [36]. The number of data points in the generated signal corresponded to the number of samples in twenty minutes of EEG data at a sampling rate of 200 Hz to approximately match standard clinical data collected from human subjects.

B. EEG Recording and Pre-Processing
We retrospectively analyzed three human EEG datasets from infant subjects enrolled in two different studies at the Children's Hospital of Orange County (CHOC) to assess the utility of MBB and DFA. Both of these studies investigated a seizure type known as infantile spasms. These studies were approved by the CHOC Institutional Review Board.
In the first dataset, a board-certified pediatric epileptologist (DS) retrospectively identified 20-30 minute scalp EEG recordings from 21 normal infants aged 4-19 months (median 6.3 months). All 21 recordings were ultimately classified as normal. This dataset was used to demonstrate the methods and parameter choices for MBB-DFA (see Sections IID, IIE, IIIB, and IIIC). We will refer to this dataset as the Control dataset.
The second dataset was comprised of a long-term (>24 hour) video-EEG recording of a 7-month old otherwise normal infant who was having events suspected to be seizures. This recording was also ultimately classified as normal by the same epileptologist (DS). We selected a data segment that included 126 minutes of wakefulness and 166 minutes of sleep, without regard to specific sleep stages. (See Sections IIF and IIIE). We will refer to this dataset as the Long-Term dataset.
Lastly, the third dataset consisted of recordings from two infants who were diagnosed with infantile spasms. These subjects were chosen to demonstrate changes in the DFA exponent distributions after treatment. Subject A was a 5.5-month-old infant whose epilepsy was treated successfully with adrenocorticotropic hormone (ACTH), a common therapy for the disease. We selected 24.3 and 21.2 minutes of awake data from the pretreatment and post-treatment recordings of this subject, respectively. The clipped segments were chosen because they were relatively free of artifacts. Successful treatment was defined as a cessation of seizures and resolution of an epileptiform EEG pattern known as hypsarrhythmia [37], [38]. Subject B was a 3.7-month-old infant who was treated unsuccessfully with vigabatrin, exhibiting persistent spasms and hypsarrhythmia after treatment initiation. This subject's pre-and post-treatment EEG included 20.4 and 23.4 minutes of awake data, respectively. We will call this two-subject dataset the IS Subject dataset.
All recordings consisted of nineteen scalp electrodes placed according to the international 10-20 system, sampled at 200 Hz with impedances below 5 kΩ. Artifacts were visually identified by DS in the broadband bandpass-filtered data (1.0-40 Hz). Start and end times of high-amplitude muscle activity, eye blinks, subject movement, poor electrode contact, impedance checks, and photic stimulation were marked. We added a buffer on either side of the visually marked artifactual time periods to account for any signal spread due to the filtering done prior to removal. EEG data were re-referenced to a linked-ear montage and divided into narrow frequency bands using FIR filters for the delta (1-4 Hz, order = 400), theta (4-7 Hz, order = 100), alpha (8)(9)(10)(11)(12) Hz, order = 50), and beta (14-30 Hz, order = 29) frequency bands (Supplementary Figure 2). After the bandpass filter was applied, the time periods marked as artifact were removed from all channels and the remaining clean data segments were concatenated. This procedure was followed whenever artifacts were removed from the data. Note that for some analyses (Section II. E and F), some or all of the artifacts were purposely left in the data.

C. Detrended Fluctuation Analysis
Detrended Fluctuation Analysis was applied to both simulated and human scalp EEG data. We repeat the algorithm here for completeness; all steps are as originally described in Peng et The magnitude of the Hilbert transform is representing the amplitude envelope. We definex to be the average 1 where k is the number of samples in the time series. The signal profile at time t is the integrated zero-mean envelope Note that in the simulated data, the mean is subtracted from the fractional Gaussian noise and then integrated over time.
The signal profile {u(t)} is divided into equally-sized windows with 50% overlap, which we term "boxes" [39] (Fig. 1A). We define the q th box of size M to be the signal profile at time points t q + 1, t q + 2, . . . t q + M . In this box, u q can be represented by a linear model u q (t) = β q 0 + β q 1 t where t = t q + 1, . . . t q + M , where the estimates of the regression parameters denoted byβ q 0 andβ q 1 are obtained by the ordinary least squares criterion. The estimated trend is removed to form the residuals for Note thatR q theoretically equals zero because the interceptβ q 0 is included in the linear model. For boxes of size M , we denote the total number of boxes to be Q M . The median of the standard deviations across all boxes q = 1, . . . , Q M is computed to be This process is performed for twenty logarithmically-spaced (base 10) box sizes M = 1 second (200 samples) to M = ( 1 10 signal length). The slope of the scatterplot for log{SD M } as a function of log(M ) is estimated using least squares (Supplementary Figure 3). This sample slope is the DFA exponent, (a) We compared Control EEG data (n = 21) with all artifacts removed (α f u ll c le a n ) to data with only a subset of the artifact time periods removed. We randomly selected 10% of artifact time periods to be removed from the data for twenty iterations and recorded the median value for each channel. This was repeated for an increasing percentage of artifact time periods in 10% increments. The presence of artifacts increases the estimate of α and biases the distribution. (b) We compared continuous data during sleep (α f u ll ) to data containing discontinuities due to the removal of segments of data. The segment properties matched the temporal properties of artifacts in awake EEG. The removal of segments of data did not bias the distribution and minimally increased the variance. Representative data is shown from the theta frequency band in both panels.
denoted α, and it serves as a direct estimate of the Hurst exponent (H).

D. Model Selection to Ensure Linearity of DFA Plots
Because DFA will return an exponent value even when the resultant scatterplot is nonlinear, we ensured both simulated and human EEG data was best described by a linear model using the model selection technique outlined in [40].
We tested four models with parameters θ n : Linear: θ 1 e θ 2 x + θ 3 Using the Bayesian information criterion (BIC), we determined which model best fit the empirical data.
We tested three sets of data. First, we simulated data with H = 0.7 and a signal length of 240000 samples. We generated 1000 independent realizations of these data and reported the percentage of realizations that best fit the given model. Second, we repeated the test with H = 0.95 to ensure similar performance with stronger long-range temporal correlations. Third, we tested our Control dataset (n = 21 subjects, minimum of 250 seconds of data during wakefulness). We performed this test in all 19 channels and in all four frequency bands (except for one channel that was removed due to artifact), resulting in 1592 observations.

E. The Effect of Signal Length
To estimate the variation in the DFA exponent α when analyzing time series of different lengths, we examined the difference between α measured for a short window of data and α measured for the entire signal length. We simulated time series that were at least 2.4 × 10 5 samples in total, the equivalent of twenty minutes of data sampled at 200 Hz. We performed DFA on the entire signal length and labeled this value as the "true" DFA exponent (α). The signal was then divided into non-overlapping windows of one minute of data (1.2 × 10 4 samples), and the DFA exponent was calculated for each one-minute segment (α). We recorded the difference between α andα. This process was repeated for windows increasing in size by one minute up to ten minutes.
To demonstrate this effect in human EEG data, we analyzed the sleeping portions of the Control dataset before artifact removal. The artifacts were left in the data to minimize the number of discontinuities in the signal. The analysis included patients from this dataset whose sleeping portions of data exceeded 14 minutes in length (n = 10 subjects). The iterative process described for simulated data was repeated for these data. The DFA exponent calculated for the entire EEG time series was labeled α f ull and the exponents for the shorter segments (α) were compared with the α f ull value.

F. The Effect of Artifacts
The detection and removal of artifacts is a common and often necessary practice in human EEG data analysis. We directly tested the effect of the presence of artifacts and the effect of artifact removal on the estimate of H. To investigate how artifacts affect α, we first bandpass-filtered data during wakefulness, removed all artifacts, and measured the DFA exponent for each electrode in the Control dataset (α f ull clean ). We compared this value to the DFA exponents measured in bandpass-filtered data with a percentage of artifact time periods removed randomly (α). We chose different subsets of artifacts for twenty iterations, resulting in a distribution of α values for each percentage of artifacts removed. This process was repeated in increments of 10% of artifact time periods removed in all awake portions of the 21 subjects in all channels.
The removal of artifacts creates discontinuities in the time series and this alters the temporal structure of the signal. We tested how removing small portions of data affects the estimate of H. This question was also investigated by [41], but the study did not quantify the effect of such discontinuities on the variance of the distribution of DFA exponents obtained, and their tests were only performed on simulated data. To investigate this in human EEG data, we removed epochs from sleep EEG data with the same temporal properties as artifacts, namely the artifact duration and inter-artifact duration. We performed this analysis on sleep EEG that was minimally contaminated by artifacts. We first calculated artifact and inter-artifact duration from awake EEG in the Control dataset and then assigned simulated "artifact" times to the sleep EEG signal. The number of simulated artifacts was chosen according to a linear model of the number of artifacts per dataset length. We calculated the DFA exponent for the unaltered signal (α f ull ), and the DFA exponent for the time series with discontinuities due to a percentage of the simulated artifact time periods removed (α).

G. The Effect of Discontinuities Due to Bootstrapping
Employing a moving block bootstrap will introduce discontinuities into the data because the process concatenates randomly sampled blocks of time series data (Fig. 1B, B B ) [26], [42], [43]. We wanted to quantify the effect of this type of discontinuity on the estimate of H using DFA. To do this, we first generated simulated time series as before, effectively twenty minutes in length. The DFA exponent was calculated for this original signal (α). Discontinuities were introduced by randomly selecting one 50-second block of data from the time series, removing it from its original location, and adding it to the end of the time series, creating two discontinuities with one window translation. We chose a block length of 50 seconds to match the bootstrapping block length (B B , Fig. 1B). This process was repeated for ten iterations of block translations, creating a new time series with twenty discontinuities. We measured the DFA exponent (α) for the new time series with each iteration and recorded the difference between α andα.
We also tested the Control EEG dataset. We recorded the difference between the DFA exponent measured for the original time series (α f ull ) and the exponent measured with each window translation (α). We analyzed patients from this dataset with at least 500 seconds of continuous EEG during wakefulness (n = 13 subjects).

H. Bootstrap Analysis
We employed MBB-DFA by concatenating randomly selected blocks of data from the time series [26], [27], [42]. This method preserves the dependence structure in the time series by concatenating blocks of sufficient length to maintain correlation on a shorter time scale [43]. To build a bootstrap distribution in simulated data, we randomly selected ten blocks, each 50-seconds (1.0 × 10 4 samples) in length (B B , Fig. 1B), from the 20-minute signal and concatenated the blocks to create a new time series that was 500 seconds long. DFA was performed on the new time series and the exponent (α) was calculated for 500 realizations. We then used MBB to obtain bootstrap distributions for the Long-Term and IS Subject datasets. All recordings in these datasets exceeded 1000 seconds in length (at least twenty 50-second windows).
The DFA exponent and its confidence intervals were tracked over time in the Long-Term dataset. The time series was segmented into 79 overlapping "global" windows that were 1000 seconds in length with 80% overlap (B G , Fig. 1B, shift = 200 seconds after artifact removal). Each global window of data was individually bootstrapped to calculate the distribution of DFA exponents (α) for that window. In each global window, we randomly extracted ten 50-second blocks of data and concatenated the blocks to make a new 500-second signal as described above. We analyzed this signal with DFA, recorded α, and repeated this process to create a distribution of 500 values. The empirical distributions were obtained for all 79 global windows.

A. Model Selection in DFA
We first confirmed that the resultant scatterplot from DFA was best fit with a linear model, indicating that the amplitude modulations of the time series can accurately be described by power-law scaling. We observed that the two cases of simulated data (H = 0.7 and H = 0.95) resulted in nearly identical preferences toward the given models, with the linear model being chosen ∼78% of the time (Supplementary Table 1). This indicates that the value of H does not alter the preferred model. For Control EEG, almost 60% of the data was best described by the linear model, with the next best model being preferred less than 20% of the time (Supplementary Table 1). However, of the 40% of cases that did not prefer the linear model, the BIC values for the chosen model and the linear model were less than 3% different in 75% of cases. Thus, we concluded from the model selection procedure that the linear model was appropriate for our EEG data and we used a linear model exclusively in our analyses.

B. Variance of DFA Exponents in Simulated Data
We first measured the variation in the DFA algorithm output by generating simulated data with a known H and recording the estimate of H, represented by the DFA exponent (α) over 500 realizations. The mean of the distribution of α values was very similar to the true exponent H used to simulate the data (Supplementary Figure 4). The maximum standard deviation of the distributions was 0.024. Thus, the proposed estimator is approximately unbiased for the true exponent H. Histograms of selected distributions showed little deviation from normality (Supplementary Figure 4).

C. The Variance of the DFA Estimate Decreases for Longer Signal Lengths
There is greater uncertainty in DFA exponent estimation in shorter data segments because the smaller box sizes contain a Fig. 4. The effect of bootstrapping discontinuities on α. (a) We compared the DFA exponent measured for simulated data (α) to data with an increasing number of windows translated (α), creating discontinuities in the data similar to those created during the bootstrapping process. The variance of the distribution increases slightly as more discontinuities are created. (b) In the Control EEG dataset (n = 13 subjects), we measured differences between α values for EEG time series with translated windows compared to the DFA exponent measured using the original continuous EEG signal (α f u ll ). The median and interquartile range of the distribution slightly increased as a function of the number of introduced discontinuities in the time series. Representative data is shown from the theta frequency band.
smaller number of data points. We tested this effect using simulated data with H ranging from 0.50 to 0.99. We recorded the difference between the α for a selected segment of the time series andα, and we aggregated the results from 50 independent realizations of simulated data ( Fig. 2A). The distributions indicate no trend in the median α value as a function of segment length, but the variance of the distribution is higher for shorter data segments than longer segments.
The Control EEG data show bias in α, with the largest effect in the lower frequency bands (Fig. 2B). Similar to the simulation results ( Fig. 2A), the variance decreases with longer signal lengths. Though the EEG data interquartile range is only slightly larger than the simulated data, there are more outliers in the human EEG data. This is most likely due to the presence of artifacts.

D. Artifacts Increase the Variance of the DFA estimate
Data containing artifacts had higher α values than data in which artifacts had been removed (α f ull clean ) (Fig. 3A). For each permutation in which random artifact time periods were removed, the median α value is shown in Fig. 3A (see Supplementary Figure 5 for all values). In contrast, the introduction of discontinuities with the same temporal properties as artifacts ( Supplementary Figures 6 and 7) (to simulate artifact removal) did not bias the α distributions and had minimal effect on the variance of the distribution (Fig. 3B, full figure is shown in Supplementary Figure 8). The range of the α − α f ull clean distribution due to the presence of artifacts (Fig. 3A) is nearly an order of magnitude larger than the range of the α − α f ull distribution (Fig. 3B), indicating that it is better to remove artifacts from data prior to applying DFA, even if it introduces discontinuities.

E. Discontinuities Due to Bootstrapping do Not Significantly Increase DFA Exponent Variance
Bootstrapping and the removal of artifacts will introduce discontinuities into the time series. We examined the effect of such discontinuities on the variance of the DFA exponent. In simulated data with varying numbers of added discontinuities, we found no trend in the median α −α value for any value of H (Fig. 4A). Here α represents the DFA exponent with added discontinuities andα is the DFA exponent of the signal with no discontinuities. The standard deviation of the distribution, however, increased with the number of added discontinuities. This increase in standard deviation was greater for more positively correlated signals (Supplementary Figure 9).
In Control EEG data, the addition of discontinuities by translating 50-second windows of data caused slight increases in the median α value and the variance of the distribution, and this trend was present in all frequency bands (Fig. 4B, Supplementary Figure 10). However, we note that the variance of α in EEG data (Supplementary Figure 10) is only slightly bigger than the variance of α in simulated data (Supplementary Figure 9), indicating that these variations are on the same order as noise fluctuations. This slight bias may also be present because we used awake EEG data which contained artifacts.

F. Bootstrapped Distributions of DFA Exponents in Simulated Data
Applying MBB to time series generates independently sampled realizations of the time series (correlated within a time series; independent across different time series realizations) upon which distributions of a test statistic can be derived. We generated simulated data, block-bootstrapped the signal, and measured α for 500 iterations. We tested simulated data with values of H ranging from 0.50 to 0.95, in increments of 0.05, using twenty realizations for each value of H (Supplementary Figure 11). The median values of the bootstrapped α distributions for individual realizations varied from one another despite being generated with the same Hurst exponent (Supplementary Figure 11). However, the variation in the median was comparable to the variation measured for independent simulated data realizations (Supplementary Figure 6), so this variation was expected. When the DFA exponent deviations (α −α) were aggregated over all twenty realizations of noise for all H

G. DFA Bootstrapping in Long-Term EEG Data
We used MBB and DFA to measure the statistical significance of changes in α in the Long-Term dataset (Fig. 5). This EEG dataset contained transitions in brain state over 292.1 minutes: the subject was awake at the beginning of the recording, fell asleep at minute 56, woke up at minute 79, and then fell asleep again at minute 149 for the rest of the recording. We found significant differences between α during wakefulness and sleep in all frequency bands and in all channels except for channel C3 in the alpha band (Wilcoxon rank-sum test, corrected via Benjamini-Hochberg procedure [44], adj. p < 0.05).

H. DFA Bootstrapping for Within-Subject Comparison of Pre-and Post-Treatment EEG Data
Lastly, in the IS Subject dataset, we compared α before treatment and after treatment in the delta, theta, and alpha frequency bands (Fig. 6). For Subject A (blue), in all frequency bands and all channels, there was a statistically significant increase in the strength of correlations after successful treatment (Wilcoxon rank-sum, corrected via Benjamini-Hochberg procedure, adj. p < 0.05). For Subject B (red), the distributions were significantly different in all channels except P3 in the delta band, O2 in the theta band, and P4, O1, and T3 in the alpha band (Wilcoxon rank-sum, corrected via Benjamini-Hochberg procedure, adj. p < 0.05). Results for all channels are shown in Supplementary  Figure 13.

IV. DISCUSSION
When applying DFA to neural data, one major limitation is that there is no framework to examine the variance of the Hurst exponent estimate. Our proposed approach integrates the moving block bootstrap with DFA to enable the calculation of confidence intervals for the Hurst exponent estimate in neural data. We showed that the use of short data segments increases the variability of the DFA exponent (Fig. 2), suggesting the use of longer segments of data whenever possible. We found that the presence of artifacts positively biases α (Fig. 3A). This effect is successfully mitigated by removal of artifactual segments, even though this introduces additional discontinuities (Fig. 3B). We also found that the introduction of discontinuities due to the bootstrap process does not increase the variance more than what can be expected from noise fluctuations ( Fig. 4B and Supplementary Figure 4). This allows techniques like DFA to be successfully applied to human neural data. When analyzing EEG data, MBB-DFA enabled calculation of a time-varying measurement of α, and we found significant differences between wake and sleep (Fig. 5). For application to a clinical setting, we show that our method can be used to compare pre-treatment and post-treatment EEG data in subjects with infantile spasms (Fig. 6). This demonstrates the potential impact of our approach, as single-subject comparisons were not possible with previously available methods.
A. Practical Considerations for Implementation of MBB and DFA on Neural Data 1) Selection of DFA Window Size: Although there are many ways to measure temporal dependence in neural time series [45]- [50], we chose to use DFA to estimate the Hurst parameter due to its widespread application in current neuroscience literature [6], [9], [13], [18], [19]. The method was originally developed to accurately estimate the Hurst exponent despite nonstationarities in the data, supporting its use as the preferred method to analyze temporal structure [2], [39], [51]. However, some groups have heavily criticized DFA in this regard in recent years, showing circumstances where certain nonstationarities and parameter selection greatly influenced the accuracy of the measure [52], [53]. For example, [53] showed that the resultant DFA plot will become nonlinear if the minimum window size is too small. Likewise, if window sizes are too long, the resultant plot may have "cross-over" points, requiring special analysis techniques [41], [54]. Additionally, certain nonstationarities such as periodicity and trends have been shown to reduce the accuracy of the algorithm [55], [56].
To address the issue of appropriately-sized windows for DFA (M, Fig. 1A), we chose our smallest window size to be 1 second (200 samples). This exceeds the window size in which nonlinearity in the DFA plot occurs [53]. We set our largest window size to 1/10 of the signal length [39]. Earlier comparison with the autocorrelation function of the amplitude envelope in these data showed that correlations were significant up to 100 seconds [57]. Our maximum window size was on the order of 120 seconds for both the simulated data and the longest control datasets.
2) Time Series Length: We showed that the variance of the Hurst exponent estimate was a function of the length of the time series being analyzed. While some DFA studies analyze segments of data that are only several seconds long [18], [58], we showed that the most reliable estimates ofα are achieved for segments of data that were several minutes long (Fig. 2). This is not surprising since longer time segments have more time points and hence will result in lower variances in the Hurst exponent estimator. We note that estimates of H using short time segments are not necessarily incorrect; they may simply be describing the temporal structure of the signal more locally and are more subject to variation. However, because the variance in the measure is greater with shorter segments of data, more subjects or trials may be necessary to demonstrate a significant result. These results also highlight the specific value of the MBB method. For example, we can compare the distribution of DFA exponents acquired by simply cutting the time series into smaller segments ( Fig. 2A) to the distribution of DFA exponents resulting from MBB (Supplementary Figure 11B). The bootstrapped values were closer to the value for the original signal than those measured for shorter segments, showing that MBB provided a better estimate of the Hurst exponent without sacrificing statistical strength.
We found that the variation in α −α stabilized with segments of EEG data five to eight minutes long, defining a minimum length to ensure the best attainable estimate of the Hurst exponent. This result informed our decision to use global windows (B G ) of 1000 seconds; because employing a block-bootstrap involved calculating α for 500-second segments (8.33 minutes) of block shuffled data, this choice of global window size ensured that the bootstrapped data length still exceeded the minimum data length of five to eight minutes.
3) Parameters for Moving Block Bootstrap: We chose a length of 50 seconds for B B because a sufficient binomial coefficient was needed to perform 500 iterations of random block sampling. In the simulations, we randomly sampled ten 50-second blocks of data from the 1200 seconds of data (24 blocks, each 50 seconds long). In the long-term datasets, we chose ten 50-second blocks of data from the 1000 seconds of data in the global window B G (20 blocks, each 50 seconds long). These parameters resulted in binomial coefficients of 1961256 and 184756, respectively, exceeding the coefficients needed for 500 iterations of sampling ( n C k = n ! k !(n −k )! ). We note that the differences between DFA exponent distributions generated for 1200-second datasets and 1000-second datasets were negligible (data not shown). We also note that including ten 50-second blocks means that the largest box size included in the DFA analysis is 50 seconds. This is necessary because the long-range temporal structure beyond this box size has been destroyed. 4) Discontinuities and Artifacts: Artifact removal improves confidence in the DFA estimate because it removes data of nonneural origin, but the process introduces discontinuities into the data. We showed that the presence of artifacts positively biased the median value of the DFA exponent distribution (Fig. 3A), while discontinuities due to artifact removal had minimal impact on the estimate (Fig. 3B). This substantiates the practice of removing artifacts prior to any DFA analysis. However, ensuring enough data remains for robust estimates of the DFA exponent is imperative. Alternative artifact rejection strategies like Independent Components Analysis may provide similar results while reducing the number of discontinuities in the time series [15].
Lastly, we showed that discontinuities in the data due to bootstrapping do not increase the DFA exponent variance more than that seen from pure noise fluctuation (Fig. 4B, Supplementary  Figure 4). Previous studies showed that the DFA exponent is heavily affected by the introduction of discontinuities if the signals are anti-correlated (0 < H < 0.5) [41]. However, our results coincided with their findings that the exponent estimation remains largely unchanged in the presence of discontinuities in positively-correlated signals [41]. In previous literature, neural signals have unanimously been described as positively correlated [6], [39].

B. Applications of MBB and DFA
We showed that MBB and DFA can be used to track changes in the temporal structure of the EEG in a healthy infant during a sleep-wake cycle (Fig. 5). The time-varying distributions of DFA exponents showed the largest differences between wakefulness and sleep in the delta frequency band. The results shown here are related to the work in [59], in which time-varying DFA exponents are measured via Kalman filtering and covariance calculation.
We additionally showed that these techniques could be used clinically to compare pre-and post-treatment data in two subjects with infantile spasms (Fig. 6). The two subjects were chosen as examples to show the utility of the MBB-DFA method. In one subject, the change in α is large, while the other shows little change in α after treatment. Our previous work suggests that the larger increase in α in the responding subject may be due to changes in the neuronal network associated with treatment response. In that study, we showed that the DFA exponent is significantly lower in infantile spasms patients compared to healthy controls, and the value normalizes with successful treatment [14]. However, this effect on DFA exponent distributions calculated with MBB will need to be validated on a larger dataset.
Other groups have used DFA in clinical applications to investigate EEG changes when a patient experiences a stroke [18] and for neonatal background differentiation [19]. In an ECoG study in epilepsy patients, long-range temporal correlations measured with DFA were stronger when the electrodes were near the epileptogenic zone [12]. These studies show that DFA can provide useful information in analysis of group-wise statistics.
However, DFA has not often been used to describe withinsubject changes. Although some groups have investigated properties of the DFA algorithm to assess local changes in the time series [60], these studies focus on assessing changes in temporal structure at the smallest time scales possible [19], [61]. In contrast, our study focused on filtering out local variations in the temporal structure to make conclusions about lasting, global changes in the strength of correlations in the time series, enabling tracking of slow changes in brain state or in response to treatment.
A moving-block bootstrap technique has been implemented alongside DFA before, but has not yet been applied to human neural data [29]. The variation of the DFA exponents in bootstrapped distributions of human data bore a strong resemblance to our results for simulated data. We hypothesize this is due to the long window sizes being analyzed: though we know brain activity is changing on a millisecond time scale, the average correlation strength is consistent when analyzed over hundreds of seconds. We hypothesize this to be crucial in the analysis of brain state changes, such as wake and sleep staging, as well as comparing pathological and healthy activity.
Assessing changes in the Hurst exponent over time has the potential for significant impact in the fields of cognition, brain stimulation, and medicine. This impact can be broadened further with the natural extension to a multivariate moving block bootstrap, enabling analysis of spatially-varying DFA exponents with confidence intervals. These methods can be broadly applied to longitudinal neurophysiological data, shedding light on cognitive processes and progression of mental and neurological diseases and their treatments. More specifically, when used as a measurement of the disease state in patients with epilepsy, an increase in the DFA exponent for a single patient over time can function as an objective biomarker of treatment response [14]. This could guide clinical practice, inform seizure medication selection, and ultimately lead to more effective and expeditious treatment. Similarly, the assessment of DFA exponents from single subjects over time could describe disease progression in Alzheimer's [62] or depression [63]. The ability to examine the statistical significance of within-subject differences allows for an expanded view of DFA and its use in human neural data.