Intrinsic Entropy: A Novel Adaptive Method for Measuring the Instantaneous Complexity of Time Series

The determination of appropriate parameters and an appropriate window size in most entropy-based measurements of time-series complexity is a challenging problem. Inappropriate settings can lead to the loss of intrinsic information within a time series. Therefore, two adaptive methods, namely the intrinsic entropy (IE) and ensemble IE (eIE) methods, are proposed in this paper. The IE method is parameter free, and the eIE method requires only two parameters, which can be easily determined through an orthogonality test. The proposed approaches measure instantaneous complexity and thus do not require a predetermined window size. White noise and three other varieties of colored noise were used to test the stability of the proposed methods, and five types of synthetic signals and a logistic map were used to evaluate instantaneous complexity and regularity. The results revealed that the IE and eIE methods exhibit satisfactory stability. Both methods provide point-by-point entropy measures for time series. The eIE method is useful for measuring the complexity of frequency and amplitude modulation. Furthermore, the periodicity of time series can be detected using the two proposed methods.

I. INTRODUCTION

The Shannon entropy quantifies the uncertainty of a system:

$H = -\sum_{k=1}^{n} p_k \ln p_k,$  (1)

where p_k is the probability of the kth of n possible states; the entropy is highest when all states are equally likely. Different types of entropy, including the Kolmogorov, Rényi, and minimum entropy, have been derived from the concept of Shannon entropy for measuring system complexity.
The concept of entropy is also applied for measuring time-series irregularity. The approximate entropy [2] and sample entropy [3] are two popular methods derived from Kolmogorov entropy and applied in many fields for measuring time-series irregularity. However, they are sensitive to time-series length and parameter settings, and they incur high computational costs. The permutation entropy (PE) [4] and dispersion entropy (DE) [5] are alternative methods for measuring time-series irregularity. They depend on the occurrence probabilities of different time-series patterns and are derived on the basis of Shannon entropy. However, several parameters (i.e., the embedding dimension m, time delay d, and class number c) must be selected when using them to measure time-series irregularity.
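As an illustration of this parameter dependence, the following minimal Python sketch (our own simplified illustration, not the reference implementation of [4]) computes a PE value for a window of data: ordinal patterns of length m with delay d are counted, and the Shannon entropy of their occurrence probabilities is returned.

    import numpy as np
    from math import factorial

    def permutation_entropy(x, m=3, d=1):
        # Count the occurrences of each ordinal pattern of length m and delay d.
        counts = {}
        for i in range(len(x) - (m - 1) * d):
            pattern = tuple(np.argsort(x[i : i + m * d : d]))
            counts[pattern] = counts.get(pattern, 0) + 1
        p = np.array(list(counts.values()), dtype=float)
        p /= p.sum()                              # occurrence probabilities of the patterns
        pe = -(p * np.log(p)).sum()               # Shannon entropy of the pattern distribution
        return pe / np.log(factorial(m))          # normalized by ln(m!)

Changing m or d changes both the pattern alphabet and the number of usable subsequences, which is precisely the parameter sensitivity discussed above.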
The concept of Shannon entropy is also applicable to the power spectrum: after the total power is normalized to 1, the resulting measure is called the spectral entropy (SpeE) [6]. The power distribution can be derived from the Fourier spectrum, which gives the power in each frequency band [7], so no parameter needs to be selected in the procedure for calculating the SpeE. The empirical mode decomposition (EMD) energy entropy (EMDeE) is another entropy method based on a power distribution [8]. The power distribution of the EMDeE is derived from the modes decomposed through EMD, and the power of the raw time series is the sum of the power of all modes. According to the definition of EMD, theoretically, no parameter must be selected.
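For comparison, a minimal sketch of the SpeE calculation (our own illustration of the idea in [6], [7]) applies the Shannon entropy to the normalized power spectrum, with no parameter to select:

    import numpy as np

    def spectral_entropy(x):
        power = np.abs(np.fft.rfft(x)) ** 2       # power in each frequency bin
        p = power / power.sum()                   # normalize the total power to 1
        p = p[p > 0]                              # discard empty bins to avoid log(0)
        return -(p * np.log(p)).sum()             # Shannon entropy of the spectrum

The EMDeE follows the same pattern, except that the power distribution is taken over the energies of the EMD modes rather than over frequency bins.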
In the aforementioned methods, as in most entropy methods, the size of the time window must be set before the entropy is calculated. Intrinsic information might be lost if the time window is too long or too short. Another limitation of these methods is that the complexity of each point within the time window is unknown. Therefore, in this letter, two methods are proposed: the intrinsic entropy (IE) and ensemble IE (eIE) methods. Similar to the SpeE and EMDeE methods, the IE and eIE methods apply the concept of Shannon entropy to the power distribution of a time series instead of the occurrence probabilities of time-series patterns. Moreover, in the IE and eIE methods, the parameter setting can be completed easily. The time window's size need not be selected in these methods, which can measure the complexity of each point in a time series. Different types of noise and synthetic signals were used in the present study to test the proposed methods and compare them with other methods.

II. INTRINSIC ENTROPY
Consider a univariate signal x(t) and its analytic signal X(t), which is expressed as follows:

$X(t) = x(t) + iH[x(t)] = a(t)\exp\!\left(i\int \omega(t)\,dt\right),$  (2)

where i is the imaginary unit, H(·) is the Hilbert transform, a(t) is the instantaneous amplitude, exp(i∫ω(t)dt) is a term from Euler's formula, and ω(t) is the instantaneous frequency. Therefore, x(t) can be represented as the real component of X(t), that is, x(t) = Re{X(t)} = a(t)cos(∫ω(t)dt).

The steps in the IE method are as follows:
1) A well-known adaptive filtering method, namely EMD [9], which is used in the Hilbert-Huang transform, is employed to decompose x(t) into intrinsic mode functions (IMFs). In accordance with the definition of an IMF [9], EMD is terminated when the residual becomes a monotone function. Therefore, no parameter is theoretically required for decomposition, and the relation between x(t) and the IMFs is as follows:

$x(t) = \sum_{k=1}^{n} c_k(t) + r(t),$  (3)

where c_k(t) is the kth IMF, r(t) is the residual, and n is the number of decomposed IMFs. Each IMF can be represented as follows:

$c_k(t) = a_k(t)\cos\!\left(\int \omega_k(t)\,dt\right).$  (4)

2) The instantaneous amplitude a_k(t) is extracted using the FM-AM decomposition algorithm [10], which involves the following steps:
i) Initialize the iteration count: j = 1.
ii) Interpolate the upper envelope e_j(t) of |c_k(t)| through its local maxima.
iii) Normalize the IMF: c_k(t) ← c_k(t)/e_j(t).
iv) If e_j(t) > 1 for some t, let j = j + 1 and perform step ii again.
v) Otherwise, output a_k(t) as the product of all envelopes: a_k(t) = ∏_{l=1}^{j} e_l(t).
vi) End.
3) According to the aforementioned procedure, a_k(t) is positive for all t and can be considered the power of the IMF c_k(t) at time t. The following matrix is formulated: [a_1(t), a_2(t), ..., a_n(t)]^T, where each row contains the amplitude series of one IMF. Each column of this matrix can be considered the power distribution at time t. Similar to the SpeE [7] and EMDeE [8] methods, the proportion of the kth power in the distribution is calculated as follows in the IE method:

$p_k(t) = \frac{a_k(t)}{\sum_{l=1}^{n} a_l(t)}.$  (5)

Therefore, according to (1), the IE at time t can be calculated as follows:

$\mathrm{IE}(t) = -\sum_{k=1}^{n} p_k(t)\ln p_k(t).$  (6)

The IE is thus a time series instead of a single value.
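The following Python sketch makes the procedure concrete. It is an illustration under our own naming and simplifications (cubic-spline envelopes, a fixed iteration cap, a small constant guarding the logarithm), not the authors' implementation; the IMFs are assumed to be pre-computed by an EMD routine.

    import numpy as np
    from scipy.signal import find_peaks
    from scipy.interpolate import CubicSpline

    def upper_envelope(x):
        # Cubic-spline envelope through the local maxima of x, with the endpoints included.
        peaks, _ = find_peaks(x)
        if len(peaks) < 2:                          # too few maxima: fall back to a flat envelope
            return np.full(len(x), x.max())
        knots = np.unique(np.concatenate(([0], peaks, [len(x) - 1])))
        return CubicSpline(knots, x[knots])(np.arange(len(x)))

    def am_part(imf, max_iter=10):
        # FM-AM decomposition: divide the IMF by the envelope of its absolute value
        # until that envelope is <= 1 everywhere; a_k(t) is the product of the envelopes.
        c = np.asarray(imf, dtype=float)
        a = np.ones_like(c)
        for _ in range(max_iter):
            env = np.maximum(upper_envelope(np.abs(c)), 1e-12)
            a *= env
            c = c / env
            if np.all(np.abs(c) <= 1.0 + 1e-6):
                break
        return a

    def intrinsic_entropy(imfs):
        # imfs: array of shape (n_imfs, n_samples); returns IE(t), one value per sample.
        A = np.vstack([am_part(c) for c in imfs])    # a_k(t) > 0
        P = A / A.sum(axis=0, keepdims=True)         # p_k(t) as in (5)
        return -(P * np.log(P + 1e-12)).sum(axis=0)  # IE(t) as in (6)

Given IMFs obtained from any EMD library, intrinsic_entropy(imfs) returns one entropy value per sample, so no window size has to be chosen.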

III. ENSEMBLE IE
In EMD, a signal is decomposed on the basis of its local extrema. Therefore, if only a slight change occurs between a local extremum and its neighboring local extrema in a time series, EMD cannot properly decompose the signal into IMFs; this phenomenon is called the mode-mixing problem. For signals that change only slightly, ensemble EMD (EEMD) [11] and complementary EEMD (CEEMD) [12] are useful decomposition methods. In EEMD, white noise is added to the signal before EMD is performed to increase the number of local extrema. To weaken the effect of the white noise, EEMD is conducted numerous times; thus, many IMF sets are extracted. The corresponding IMFs in all sets are averaged to obtain ensemble IMFs. Similarly, in CEEMD, white noise is both added and subtracted to further weaken its effect. We adopted CEEMD for the proposed approach: in the eIE method, CEEMD is applied instead of EMD to obtain ensemble IMFs for calculating the entropy. Two parameters are required for CEEMD: the ensemble number (the number of times EMD is performed) and the standard deviation (density) of the added white noise. The purpose of the first parameter is to weaken the effect of the white noise [11], [12]. Because IMFs should be orthogonal, whether the standard deviation of the white noise affects the orthogonality of the ensemble IMFs must be determined. The following orthogonality index (OI) [9] is useful for determining the aforementioned two parameters:

$\mathrm{OI} = \sum_{t}\left(\sum_{j\neq k} c_j(t)\,c_k(t)\Big/x^2(t)\right),$  (7)

where the sums over j and k run over all decomposed components (the ensemble IMFs and the residual). Fig. 1(a) illustrates the variation in OI with the ensemble number. The OI does not change dramatically in the stable range of the ensemble number. For determining both parameters, an OI closer to 0 indicates greater orthogonality. Fig. 1(b) illustrates the variation in OI with the standard deviation of the white noise. According to our experience, 50 executions of CEEMD and a white-noise standard deviation between 5% and 20% of the standard deviation of the input signal are acceptable for most signals.
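A minimal sketch of this orthogonality test (our own illustration; the component layout and the numerical guard are assumptions) is as follows:

    import numpy as np

    def orthogonality_index(x, components):
        # Index of orthogonality of the decomposed components (ensemble IMFs plus residual);
        # values closer to 0 indicate a more orthogonal decomposition.
        x = np.asarray(x, dtype=float)
        C = np.asarray(components, dtype=float)     # shape: (n_components, n_samples)
        cross = np.zeros_like(x)
        for j in range(len(C)):
            for k in range(len(C)):
                if j != k:
                    cross += C[j] * C[k]            # pointwise cross terms between components
        return float(np.sum(cross / (x ** 2 + 1e-12)))

In practice, the ensemble number and the noise standard deviation are swept (e.g., 5%-20% of the input standard deviation), the ensemble IMFs are recomputed for each setting, and values within the range where the OI remains close to 0 are selected, as in Fig. 1.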

IV. EXPERIMENT
Four types of noise, namely 1/f^0, 1/f^1, 1/f^2, and 1/f^{-1} noise, which are also known as white, pink, brown, and blue noise, respectively, were used in the experiments. To test the effect of signal length, each noise signal was generated with lengths between 40 and 1600 points in increments of 40 points. In each entropy method, each window's entropy was calculated for increasing signal length. In the IE and eIE methods, each point of the signal had a unique entropy value. For further testing, the mean entropy of each window was calculated, and 100 simulations were performed for each type of noise. The obtained results were the means and standard deviations of the entropy for each window length (left side of Fig. 2). The IE and eIE methods were not sensitive to signal length. In addition, in the IE and eIE methods, pink noise had the highest entropy, white and brown noise had similar entropy, and blue noise had the lowest entropy. In general, white noise has high entropy at short time scales because of its irregularity [3], [4], [5]. However, pink noise is more complex than white noise [13], and this complexity is observed at long time scales [14], [15], [16]. The performance of the IE and eIE methods was similar to that of multiscale entropy methods at long time scales. This finding indicates that the IE and eIE methods may primarily capture complexity. The right side of Fig. 2 illustrates the distribution of the entropy time series in a test with a signal length of 1600 points. The entropy distribution in the time series provides additional information.
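The colored-noise test signals can be reproduced with a standard spectral-shaping sketch (our own illustration; the exact generator used in the experiments is not specified), in which the spectrum of Gaussian white noise is scaled so that the power falls off as 1/f^beta:

    import numpy as np

    def colored_noise(n, beta, rng=None):
        # 1/f^beta noise; beta = 0, 1, 2, -1 give white, pink, brown, and blue noise.
        rng = np.random.default_rng() if rng is None else rng
        spectrum = np.fft.rfft(rng.standard_normal(n))
        freqs = np.fft.rfftfreq(n)
        freqs[0] = freqs[1]                       # avoid dividing by zero at the DC bin
        spectrum *= freqs ** (-beta / 2.0)        # amplitude ~ f^(-beta/2), so power ~ 1/f^beta
        x = np.fft.irfft(spectrum, n)
        return (x - x.mean()) / x.std()           # zero mean, unit variance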
The performance of the IE and eIE methods was compared with that of various entropy methods (the PE, DE, SpeE, and EMDeE) [4], [5], [6], [7], [8]. To elucidate the characteristics of the IE and eIE methods, five types of synthetic signals with a sampling rate of 150 Hz and length of 100 s were employed (Fig. 3). The details of the synthetic signals are described in the Supplemental Materials. In the IE and eIE methods, the entropy value of each point was measured. In the PE (m = 3), DE (c = 6, m = 3, and d = 1; mapping method: normalized cumulative distribution function), SpeE, and EMDeE (decomposed by EMD) methods, a sliding window of 1000 points with 80% overlap was used [Fig. 3(a)-(e)]. As illustrated in Fig. 3(a), the EMDeE, IE, and eIE clearly captured slight phase shifts; however, the entropy at the phase shift point was low only in the IE method. This phenomenon might be attributable to mode mixing. As displayed in Fig. 3(b), the DE and PE failed to capture the complexity of quasiperiodic signals, which have higher complexity than does a sine wave. In addition, the SpeE, IE, and eIE exhibited high entropy values at the concatenation point. As displayed in Fig. 3(c), the PE and DE values uniformly increased with frequency, and the SpeE method failed to capture the frequency modulation. Because only one IMF could be extracted through EMD, the IE and EMDeE methods yielded only zero values. Furthermore, Fig. 4 illustrates why some parts of the measured signal had higher eIE values than did other parts. The eIE accounts not only for the signal frequency but also for the transition segments at frequency changes, which are determined by the components of the input signal; CEEMD can be used to extract these components. Both these transition segments and the high-frequency parts indicate higher complexity. As illustrated in Fig. 3(d), the DE and PE did not reflect amplitude modulation but the IE and eIE did, and only the eIE reflected frequency modulation. The EMDeE method exhibited performance similar to that of the IE method. In the IE and eIE methods, the entropy tended to increase after t = 80 s, at which point the amplitude modulation of the segment became more frequent. As depicted in Fig. 3(e) and (f), when the size of the sliding window was inappropriate, the DE, PE, SpeE, and EMDeE methods failed to capture the high complexity of the peaks of complex intermittent signals. Some methods captured this complexity only when the sliding window's size was appropriate [Fig. 3(g) for DE and Fig. 3(h) for PE]. The IE method failed to measure the complexity because intermittent signals could not be decomposed into appropriate IMFs through EMD. CEEMD is designed to address this problem; therefore, the eIE value was higher in the more complex parts of the intermittent signal. For the five adopted synthetic signals, the PE, DE, SpeE, and EMDeE methods required an appropriate predetermined sliding window size for complexity measurement, whereas the IE and eIE methods did not. Thus, the IE and eIE methods can be used to measure the instantaneous complexity of time series, and the entropy value at time t can be compared directly with the pattern of the raw input signal at time t without any phase shift.
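For reference, the sliding-window setup used for the window-based methods can be expressed as a short wrapper (our own illustrative helper; entropy_fn stands for any window-based measure, such as the permutation_entropy sketch above):

    import numpy as np

    def sliding_window_entropy(x, entropy_fn, win=1000, overlap=0.8):
        # Evaluate a window-based entropy measure over overlapping windows;
        # the IE and eIE methods need no such window.
        step = max(1, int(win * (1 - overlap)))
        starts = range(0, len(x) - win + 1, step)
        centers = np.array([s + win // 2 for s in starts])
        values = np.array([entropy_fn(x[s:s + win]) for s in starts])
        return centers, values

Each windowed value is assigned to the center of its window, which is why the window size and overlap directly determine the temporal resolution (and possible phase shift) of these methods, in contrast to the point-by-point IE and eIE.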
To determine whether each entropy method could detect segment periodicity, a logistic map was employed. A signal with a sampling rate of 150 Hz and length of 100 s was generated by the following equation:

$x_{i+1} = \gamma\, x_i (1 - x_i),$  (8)

where t was varied between 1 and 100 s in intervals of 100/15000 s. The term γ is the parameter of the logistic map and was varied from 3.5 to 3.99 in the present study. When 3.5 < γ < 3.57, the periodic time series of the signal had low entropy. The rest of the signal was a chaotic time series. However, when γ was approximately 3.8, a notable periodic series occurred [17]. A sliding window of 60 points with 80% overlap was employed for each entropy method except the IE and eIE methods. The other settings for each entropy method were the same as those detailed in the aforementioned text. In the investigation conducted with the logistic map, only the IE, eIE, PE, and DE methods were focused on because the EMDeE and SpeE methods are only suitable for long periodic series. The logistic map time series and each type of entropy series are displayed in Fig. 5(a). In each method except the IE method, the entropy was smaller for the long periodic series than for the other parts of the signal (γ ≈ 3.5-3.57 and γ ≈ 3.8). In the IE method, in these two periodic parts, the entropy first increased and then decreased. This reversal might have been caused by the limited number of local extrema (the mode-mixing problem) and the boundary effect, which are common problems in EMD analysis and can be addressed through CEEMD. For γ ≈ 3.5-3.57 in the logistic map time series, although this segment was periodic, its amplitude increased; this increasing amplitude was reflected only by the eIE method. As illustrated in Fig. 5(b) and (c), in the case of the PE, DE, IE, and eIE, smaller entropy values were observed for the periodic parts of the signal than for the other signal parts. However, for considerably short periodic series, only the IE and eIE could detect the periodic parts [i.e., t ≈ 44.1 in Fig. 5(d) and t ≈ 21.1 and 21.2 in Fig. 5(f)]. To detect periodic parts in a logistic map time series by using conventional methods, an appropriate window size and overlap ratio must be set. In the IE and eIE methods, the signal regularity can be calculated directly over the length of the entire signal.
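A minimal sketch of this test signal (our own illustration; the initial condition and the exact γ sweep are assumptions, since only the γ range is stated) is as follows:

    import numpy as np

    def logistic_map_series(n=15000, gamma_start=3.5, gamma_end=3.99, x0=0.5):
        # Logistic map x_{i+1} = gamma_i * x_i * (1 - x_i) with gamma swept linearly;
        # n = 15000 samples correspond to 100 s at a 150-Hz sampling rate.
        gammas = np.linspace(gamma_start, gamma_end, n)
        x = np.empty(n)
        x[0] = x0
        for i in range(n - 1):
            x[i + 1] = gammas[i] * x[i] * (1.0 - x[i])
        return x, gammas

Segments generated with γ in the period-doubling range (3.5-3.57) and near the periodic window around γ ≈ 3.8 are the low-entropy parts that the entropy series in Fig. 5 are expected to highlight.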
To understand the dependence of the time cost of each method on the signal length, the time cost of each method was determined for signal lengths from 40 to 16000 points [Fig. 6]. The parameter settings for each method were the same as those adopted in the aforementioned experiment. Although the eIE method exhibited the highest time cost among the compared methods, its execution time was only 2.8 s when the signal length was 16000 points. Moreover, when CEEMD was conducted using parallel computing, the time cost of the eIE method was close to those of the IE and EMDeE methods. Some issues should be noted. Although the IE method is theoretically a parameter-free method, some criteria, such as the stop criterion of the sifting process in EMD [9], [11], [12], can be applied to increase the efficiency of EMD. Many EMD-like methods, such as the empirical wavelet transform [18] and variational mode decomposition [19], have been proposed. Most of them are also suitable for adaptively decomposing a time series to determine its features or for resolving the mode-mixing problem and boundary effects. Future studies should test whether the IE or eIE method can achieve better results with the aforementioned methods. Another issue is that most EMD-related studies have selected suitable IMFs for further processing by using different methods [9], [20], [21], [22], [23]. In the present study, we used a significance test to select the IMFs used to calculate the entropy [22], [23]. To obtain superior results, the optimal method for IMF selection should be identified. Moreover, the maximum entropy is obtained from (6) when a_1(t) = a_2(t) = ... = a_n(t). Therefore, the IE and eIE can be normalized through division by $-\sum_{k=1}^{n}\frac{1}{n}\ln\frac{1}{n}$, which is equal to ln n. However, after normalization, the effect of the number of IMFs is eliminated. In our opinion, a raw signal from which more IMFs can be extracted has higher complexity. Therefore, whether normalization is suitable for the IE and eIE methods requires further investigation.
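Written out, the normalization argument is as follows: when all IMF powers are equal, every proportion is p_k(t) = 1/n, so (6) attains its maximum,

$\mathrm{IE}_{\max} = -\sum_{k=1}^{n}\frac{1}{n}\ln\frac{1}{n} = \ln n,$

and the normalized entropy IE(t)/ln n lies in [0, 1] but no longer reflects how many IMFs the signal yields.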