Characterization of the Ethanol-water Blend by Acoustic Signature Analysis in Ultrasonic Signals

The use of ethanol as fuel in Brazil stimulated competition between distribution companies and resellers, which aggravated the practice of fuel adulteration for illicit gain and tax evasion. The most common form of adulteration of fuel alcohol is the addition of water. The classic techniques for measuring the water content in ethanol offer good precision and detail regarding the presence of water. However, they have disadvantages, such as the need for sample collection, long analysis time, and the need for a specialized laboratory and trained personnel. This work proposes digital signal processing techniques to analyze and quantify the presence of water in ethanol fuel using a combination of Principal Component Analysis (PCA) and Singular Spectrum Analysis (SSA) applied to ultrasonic signals. The method resulted in the proposal of a new score, which relates to the ethanol/water ratio of the mixture. The results were promising when relating the proposed score to the amount of water present in the ethanol. The experiments performed indicate the technique's feasibility and pave the way for a new method for real-time monitoring.


I. INTRODUCTION
Ethanol is an alcohol widely used as a fuel or as a gasoline additive in the fuel industry. Due to its high solubility in water, the fast and effective determination of the water content in ethanol is essential to guarantee the quality of the product. An alcohol content below the allowable limit makes combustion more difficult because of the greater amount of water present in the fuel. Levels above the established limit increase the volatility of the fuel and can cause problems in the functioning of engines. As ethanol is added to water, the interactions between the molecules lead to changes in the molecular structure of the compounds [1]. As a result, there is a variation in the final sample volume. Due to this volume dependence, some physicochemical parameters, including the speed of sound in the sample and the density, show a nonlinear behavior depending on the ethanol concentration in the mixture [2]. Classic techniques, such as Karl Fischer titration, are used for measuring the water content in ethanol [3] and offer good precision and detail regarding the presence of water. However, they have disadvantages, such as the need for sample collection, long analysis time, and the need for a specialized laboratory and trained personnel.
Ultrasound technology has been used to evaluate the composition of materials, such as insulating oil [4]- [5], fuel oil [6]- [7], foods, among others. There is a relationship between the ultrasonic velocity and the composition of the mixtures at a given temperature [8]. Since the change in ultrasonic velocity with temperature differs from substance to substance, measuring the ultrasonic velocity in a mixture at two temperatures would supply information about the volume fraction of the ingredients.
This work seeks ways to detect adulteration in ethanol fuel through the acoustic signature in a single-source ultrasound signal, with the measurement made at only one temperature and in a non-invasive way, so as to allow online applications. The main objective is to find a mechanism to characterize and classify the percentage of water present in ethanol fuel using digital signal processing techniques. The parameters used as the acoustic signature of the mixture are obtained by combining digital signal processing techniques applied to the signal captured from an ultrasonic pulse that has passed through the sample to be classified. The method developed in this work focuses on applying a set of mathematical techniques to ultrasonic signals to obtain these parameters, which are then used to classify the proportion of water in ethanol-water blends.
This work proposes a new methodology to reveal parameters in the acoustic signature of a mixture, based on the digital manipulation of data from ultrasonic transducers associated with a combination of mathematical techniques, such as Singular Value Decomposition (SVD), Singular Spectrum Analysis (SSA) and Principal Component Analysis (PCA), successfully used to estimate the water ratio in the ethanol-water mixture.
The paper is organized as follows: the related works used as a basis for its development are presented in Section II. In Section III, the mathematical concepts of SVD, PCA, and SSA are briefly presented, followed by the proposed methodologies. In Section IV, the proposed methodologies are applied in experiments that generate data to test the hypothesis of this work, and a brief discussion of the results is presented. Finally, the relevant conclusions are drawn in Section V.

II. RELATED WORKS
Veloso [9] proposed PCA through scores on acoustic signals from partial discharges in power transformers to find patterns of contamination in insulating oil, using parameters obtained with the Discrete Wavelet Transform. Noronha [4] developed a system that uses the same techniques on ultrasonic signals to evaluate moisture contamination in the insulating oil of power transformers. Ultrasonic techniques were also applied in the fuel industry by Teixeira et al. [6] for the online monitoring of the degree of water contamination in heavy fuel oil, using the FFT of ultrasonic pulses captured by the sensor after passing through the liquid mixture as input data for the PCA. A similar principle is used in this work to find a pattern in the acoustic signature of ultrasonic signals collected in water-ethanol fuel mixtures at different degrees of contamination. To that end, a combination of Fourier Transform and SSA parameters is applied to the PCA.
Studies related to the analysis of ethanol-water mixtures using ultrasound techniques have also been reported. Vatandas and Koc [8] identified the type of alcohol and the volume concentration in binary water-ethanol/methanol mixtures using ultrasonic velocity measurements. D'Arrigo and Paparelli [10] analyzed ultrasonic velocity measurements in aqueous ethanol solutions from −40°C to +30°C over the entire range of composition of the mixture (0%-100% v/v) and in the frequency range of 10-70 MHz. As a result, they proposed a model to explain the volumetric properties of ethanol mixtures at low temperatures and low solute concentration ranges. Possetti et al. [11] proposed a heterogeneous measurement system based on optical fibers and ultrasonic sensors to determine the concentration of ethanol-based solutions and to resolve the ambiguity in the measurements presented by the water-ethanol mixture. The application of a transducer to check the quality of biofuels, combining optical fibers and ultrasound, was proposed by Kawano [1]. The works cited show that the ethanol-water mixture presents an ambiguous behavior in its physicochemical properties, and they propose hybrid mechanisms to deal with this nonlinearity. This work seeks a mechanism to resolve this ambiguity and nonlinearity in measurements using the acoustic signature revealed by digital signal processing techniques.
Maddirala and Shaik [12] introduced the SSA technique combined with Independent Component Analysis (ICA) to separate sources from single-channel EEG signals, proposing a new way of grouping for reconstructed SSA signals. Kuang et al. [13] proposed an efficient and adaptive denoising method based on the multistage SSA. Multistage SSA applies basic SSA with small window length recursively to the noisy signal. As an adaptive stop criterion, they used a correlation measure. Moreover, the method exploits the use of SVD on small matrices, making it more efficient. These studies show that new forms of grouping in SSA and even elementary grouping can be exploited to separate sources into signals in an optimized way.
Fu et al. [14][15] successfully used a combination of PCA and SSA for feature extraction and data classification in hyperspectral imaging, and the method performed better when only a small training set was available. Compared to other state-of-the-art techniques, their proposed method still obtained a higher classification accuracy, which suggests that the combination of these techniques has great potential to be explored.

III. PROPOSED METHODOLOGY
The proposed method is based on the combination of SSA-PCA techniques to derive an expression that relates the acoustic signature to the ethanol-water ratio.

A. MATHEMATICAL PRELIMINARIES
The SVD is the mathematical tool used in the PCA and SSA techniques, which are part of the signal analysis method exploited to reveal the acoustic signature. Properties related to the presence of water in the ultrasonic signal applied to the ethanol-water mixture are discussed below.
1) SVD
The Singular Value Decomposition (SVD) of a matrix is a core matrix decomposition method in linear algebra. It has been referred to as the "fundamental theorem of linear algebra" [16] because it can be applied to all matrices, not just square matrices, and it always exists. An $m \times n$ matrix $A$ of rank $r$ represents a linear transformation $\Phi: V \to W$, where $V = \mathbb{R}^{n}$ and $W = \mathbb{R}^{m}$, and its SVD can be interpreted as the decomposition of this transformation into three operations. The SVD first performs a change of basis via $V^{T}$, followed by a scaling and an increase (or reduction) in dimensionality through the matrix of singular values $\Sigma$. Finally, it performs a second change of basis via $U$. Both bases are orthonormal, which is the most convenient type of basis to work with.
The SVD Theorem [16] can be written in the form:
$$A = U \Sigma V^{T} = \sum_{i=1}^{r} \sigma_i u_i v_i^{T} \qquad (1)$$
where $\sigma_i$ is a scalar (the $i$-th singular value), $u_i$ is an $m$-dimensional column vector, and $v_i$ is an $n$-dimensional column vector. Each term $\sigma_i u_i v_i^{T}$ is an $m \times n$ matrix, so (1) decomposes the matrix $A$ into $r$ matrices with the same shape ($m \times n$).
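The sum-of-rank-one-matrices reading of the SVD can be checked numerically. The sketch below is only an illustration: it assumes NumPy and uses a small random matrix, not data from this work.

```python
import numpy as np

# Hypothetical 4x3 matrix used only for illustration (not data from this work).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# Thin SVD: columns of U are u_i, s holds sigma_i, rows of Vt are v_i^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Each term sigma_i * u_i * v_i^T is an m x n matrix of rank 1.
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s))]

# Adding the terms gives back A (up to rounding).
A_rebuilt = np.sum(terms, axis=0)
print(np.allclose(A, A_rebuilt))
```

Keeping the terms in a list makes it easy to inspect or drop individual components, which is the operation that SSA later relies on.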

2) SSA
Singular Spectrum Analysis (SSA) is a method for decomposing time series and can be used efficiently to decompose signals and categorize their oscillation signatures over time [17]. It is a time series analysis technique that incorporates elements of classical time series analysis, multivariate statistics, multivariate geometry, dynamic systems, and signal processing. The method helps analyze data with complex seasonal patterns and non-stationary trends, mainly where only a single-channel measurement is available. SSA is a non-parametric method, and one of its advantages is that it does not require assumptions such as stationarity or normality of the data [18]. The technique has been used successfully in different areas of time series analysis, such as biological signals like EEG and ECG ([12], [19]-[23]), climatic and hydrological data [24], acoustic signals [25]-[28], economic and financial data, and radar applications [29]. It consists of two complementary stages: decomposition and reconstruction. Each stage can be divided into two steps, as illustrated in Fig. 1, and these stages are discussed below.
a) Decomposition: Consider a real-valued time series Y = YN = (y1, ..., yN) of length N. Suppose that N > 2 and that Y is a non-zero series; that is, there is at least one i such that yi ≠ 0. Let L (1 < L < N) be an integer called the window length, and let K = N − L + 1.
The one-dimensional time series Y is mapped into a multidimensional series by sliding a window of length L over the observed data YN of size N. A matrix is produced by stacking the resulting signal segments, with consecutive segments overlapping by L − 1 samples. This procedure is called embedding and results in the trajectory matrix, with dimensions L by K. The embedding can be considered a mapping that transfers the one-dimensional time series YN = (y1, ..., yN) to the multidimensional series $x_1, \ldots, x_K$ with vectors $x_i = (y_i, \ldots, y_{i+L-1})^{T} \in \mathbb{R}^{L}$. The vectors $x_i$ are called L-lagged vectors (or, simply, lagged vectors). The result of this step is the trajectory matrix
$$X = [x_1 : \cdots : x_K] = \begin{pmatrix} y_1 & y_2 & \cdots & y_K \\ y_2 & y_3 & \cdots & y_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ y_L & y_{L+1} & \cdots & y_N \end{pmatrix} \qquad (2)$$
which is a Hankel matrix, meaning that all the elements along each anti-diagonal, i + j = constant, are equal.
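As an illustration of the embedding step, the following sketch builds the trajectory matrix of a short series. The function name `trajectory_matrix` and the six-sample series are hypothetical and serve only to make the Hankel structure visible.

```python
import numpy as np

def trajectory_matrix(y, L):
    """Embed a 1-D series of length N into an L x K matrix, K = N - L + 1."""
    y = np.asarray(y, dtype=float)
    N = y.size
    if not 1 < L < N:
        raise ValueError("window length must satisfy 1 < L < N")
    K = N - L + 1
    # Column j is the lagged vector (y_j, ..., y_{j+L-1})^T.
    return np.column_stack([y[j:j + L] for j in range(K)])

# Toy series (hypothetical): N = 6, L = 3, so K = 4.
X = trajectory_matrix([1, 2, 3, 4, 5, 6], L=3)
print(X)
```

Every anti-diagonal of the printed matrix holds a single repeated value, which is the Hankel property mentioned in the text.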
The second step, the singular value decomposition (SVD) step, decomposes the trajectory matrix X into its orthonormal bases and represents it as a sum of elementary biorthogonal matrices of rank 1. Denote by $\lambda_1, \ldots, \lambda_L$ the eigenvalues of $C_X = XX^{T}$ in decreasing order of magnitude ($\lambda_1 \geq \cdots \geq \lambda_L \geq 0$) and by $u_1, \ldots, u_L$ the corresponding eigenvectors, which form an orthonormal system, i.e., $\langle u_i, u_j \rangle = 0$ for $i \neq j$ (the orthogonality property) and $\lVert u_i \rVert = 1$ (the unit norm property). Here $\langle u_i, u_j \rangle$ is the inner product of the vectors $u_i$ and $u_j$, and $\lVert u_i \rVert$ is the norm of the vector $u_i$. Setting $d = \operatorname{rank}(X) = \max\{i : \lambda_i > 0\}$ and denoting $v_i = X^{T} u_i / \sqrt{\lambda_i}$, the SVD of the trajectory matrix X can be written as:
$$X = \sum_{i=1}^{d} \sqrt{\lambda_i}\, u_i v_i^{T} \qquad (3a)$$
$$X = X_1 + \cdots + X_d \qquad (3b)$$
where $X_i = \sqrt{\lambda_i}\, u_i v_i^{T} = u_i u_i^{T} X$ ($i = 1, \ldots, d$). The matrices $X_i$ have rank 1, so they are elementary matrices, and $u_i$ and $v_i$ are the left and right singular vectors of the trajectory matrix X. In the SSA literature, these vectors are commonly called "empirical orthogonal functions" (EOF) and "principal components" (PC), respectively. The term $u_j u_j^{T}$ defines a subspace for the jth component present in X. Mapping X onto that subspace results in the trajectory matrix of the jth component of X. In other words, $u_j$ is the basis vector used to extract the jth component present in the time series Y [16].
The collection $(\sqrt{\lambda_i}, u_i, v_i)$ is called the ith eigentriple of the matrix X, the values $\sqrt{\lambda_i}$ ($i = 1, \ldots, d$) are the singular values of the matrix X, and the set $\{\sqrt{\lambda_i}\}_{i=1}^{d}$ is called the spectrum of the matrix X. If all eigenvalues have multiplicity one, then the expansion (3a) is uniquely defined.
The SVD of X is based on the spectral decomposition of the covariance matrix $C_X \in \mathbb{R}^{L \times L}$. It should be noted that the matrix $C_X$ is symmetric and positive semidefinite (positive definite when all its eigenvalues are nonzero). Consequently, it has a complete set of eigenvectors and can be diagonalized in the form $U \Sigma U^{T}$, where $\Sigma$ is the $L \times L$ diagonal matrix of the eigenvalues and $U = (u_1, \ldots, u_L)$ is an orthogonal matrix of eigenvectors of $C_X$. The name "singular spectrum analysis" comes from this property of the SVD technique, and the spectrum is a vital part of the method. The whole process focuses on obtaining and analyzing this spectrum of singular values to find and differentiate between signal and noise in each time series.
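The relationship between the eigenvalues of $XX^{T}$, the singular values of X, and the elementary matrices can be verified with a short sketch. The decaying sinusoid below is a hypothetical stand-in for an ultrasonic pulse, and the window length L = 40 is an arbitrary choice made for the example.

```python
import numpy as np

# Hypothetical test series: a decaying sinusoid (stand-in for an ultrasonic pulse).
n = np.arange(200)
y = np.exp(-n / 80.0) * np.sin(2 * np.pi * n / 20.0)

L = 40                                   # window length (arbitrary for this sketch)
K = y.size - L + 1
X = np.column_stack([y[j:j + L] for j in range(K)])   # trajectory matrix, L x K

# Singular values of X are the square roots of the eigenvalues of X X^T.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigvals = np.linalg.eigvalsh(X @ X.T)[::-1]           # sorted in decreasing order

# Elementary matrices X_i = sqrt(lambda_i) u_i v_i^T, which also equal u_i u_i^T X.
X_elem = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s))]

print(np.allclose(s**2, eigvals, atol=1e-8))
print(np.allclose(np.sum(X_elem, axis=0), X))
```

Computing the SVD of X directly, rather than the eigendecomposition of $XX^{T}$, is usually the numerically safer route and returns the left and right singular vectors in one call.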
b) Reconstruction: The reconstruction stage includes the grouping and diagonal averaging operations that recreate the one-dimensional time series. Once the expansion (3b) is obtained, the grouping procedure partitions the set of indices {1, ..., d} into m disjoint subsets $I_1, \ldots, I_m$. Let $I = \{i_1, \ldots, i_p\}$. Then, the resulting matrix $X_I$ corresponding to group I is defined as $X_I = X_{i_1} + \cdots + X_{i_p}$. The resulting matrices are calculated for the groups $I = I_1, \ldots, I_m$, and the expansion (3b) leads to the decomposition
$$X = X_{I_1} + \cdots + X_{I_m} \qquad (4)$$
The grouping step thus corresponds to dividing the elementary matrices into several groups and summing the matrices within each group. For a given group I, the contribution of the component $X_I$ is measured by the share of the corresponding eigenvalues:
$$\frac{\sum_{i \in I} \lambda_i}{\sum_{i=1}^{d} \lambda_i} \qquad (5)$$
Several grouping criteria for the SSA technique are found in the literature, primarily based on the size of the eigenvalues of the covariance matrix $C_X$ [18] [22]. Maddirala and Shaik [12] proposed an alternative form of grouping based on the frequency components of the eigenvectors of the covariance matrix $C_X$. This paper explores a grouping criterion that, in addition to considering the size of the eigenvalues, also considers the frequency information contained in the eigenvectors, thus combining the two grouping methods from the literature. Elementary grouping is also used.
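The eigenvalue share of a group, described above, reduces to a ratio of sums. A minimal sketch follows; the function name `group_share` and the eigenvalues are hypothetical, and indices are 0-based as is usual in Python.

```python
import numpy as np

def group_share(eigvals, group):
    """Fraction of the eigenvalue sum carried by the (0-based) indices in `group`."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals[list(group)].sum() / eigvals.sum()

# Hypothetical eigenvalues in decreasing order; they sum to 10.
eigvals = [6.0, 3.0, 0.75, 0.25]
share = group_share(eigvals, [0, 1])     # (6 + 3) / 10
print(share)
```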
After the desired groupings are made, the next step in the signal reconstruction is the diagonal averaging, because the matrices obtained may not be Hankel matrices. The diagonal averaging transforms a generic matrix into a Hankel matrix that can later be converted into a time series. If $z_{ij}$ is an element of a matrix Z of dimensions L × K, then the kth term of the resulting series is obtained as the mean of $z_{ij}$ over all i, j such that i + j = k + 1. This corresponds to the average of the matrix elements on the anti-diagonal i + j = k + 1: for k = 1, $z_1 = z_{1,1}$; for k = 2, $z_2 = (z_{1,2} + z_{2,1})/2$; for k = 3, $z_3 = (z_{1,3} + z_{2,2} + z_{3,1})/3$; and so on. Let $L^{*} = \min(L, K)$, $K^{*} = \max(L, K)$, and N = L + K − 1. Also let $z^{*}_{ij} = z_{ij}$ if L < K and $z^{*}_{ij} = z_{ji}$ otherwise. The diagonal averaging transfers the matrix Z to the series $z_1, \ldots, z_N$ using the formula:
$$z_k = \begin{cases} \dfrac{1}{k} \sum_{m=1}^{k} z^{*}_{m,\,k-m+1}, & 1 \leq k < L^{*} \\ \dfrac{1}{L^{*}} \sum_{m=1}^{L^{*}} z^{*}_{m,\,k-m+1}, & L^{*} \leq k \leq K^{*} \\ \dfrac{1}{N-k+1} \sum_{m=k-K^{*}+1}^{N-K^{*}+1} z^{*}_{m,\,k-m+1}, & K^{*} < k \leq N \end{cases} \qquad (6)$$
Applying the diagonal averaging to the matrices obtained through the SVD supplies the reconstructed signals corresponding to the chosen groupings. Therefore, the initial series y1, ..., yN is decomposed into a sum of m reconstructed series using the SSA technique. The reconstructed series produced by the elementary grouping are called the reconstructed elementary series.
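The diagonal averaging and the elementary reconstruction can be sketched as follows. This is an illustrative implementation, not the authors' code: the function names, the test series, and the window length L = 20 are assumptions made for the example.

```python
import numpy as np

def diagonal_average(Z):
    """Average an L x K matrix over its anti-diagonals (series of length L + K - 1)."""
    L, K = Z.shape
    total = np.zeros(L + K - 1)
    count = np.zeros(L + K - 1)
    for i in range(L):
        # Element Z[i, j] lies on anti-diagonal i + j (0-based).
        total[i:i + K] += Z[i, :]
        count[i:i + K] += 1
    return total / count

def ssa_elementary(y, L):
    """Reconstructed elementary series of y, one per column (elementary grouping)."""
    y = np.asarray(y, dtype=float)
    K = y.size - L + 1
    X = np.column_stack([y[j:j + L] for j in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return np.column_stack(
        [diagonal_average(s[i] * np.outer(U[:, i], Vt[i, :])) for i in range(len(s))]
    )

# Hypothetical series: trend plus oscillation.
n = np.arange(100)
y = 0.05 * n + np.sin(2 * np.pi * n / 12.0)
RC = ssa_elementary(y, L=20)

# The reconstructed elementary series add up to the original series.
print(np.allclose(RC.sum(axis=1), y))
```

Because diagonal averaging is linear and the trajectory matrix is already Hankel, summing all elementary series recovers the input exactly, which is a convenient sanity check for any SSA implementation.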

3) PCA
Principal Component Analysis (PCA) is one of the most used tools in modern data analysis in several areas [9]. It supplies a method of finding patterns in data and expressing them to highlight their similarities and differences [7].
Consider a data set X, an m×n matrix with zero-mean, standardized rows, where m is the number of different parameters and n corresponds to the number of measurements of each parameter. The principal components of X are the eigenvectors of its covariance matrix $C_X$, which is proportional to $XX^{T}$. When the SVD of $X^{T}$ is calculated, the columns of the matrix V hold the eigenvectors of $C_X$. Therefore, the columns of V are the principal components of X.
PCA seeks a linear combination of the original variables that produces orthogonal axes forming a new, meaningful basis that can re-express the associated data set, filter noise, and reveal hidden information. It assumes that the data to be analyzed have a high signal-to-noise ratio (SNR). Principal components with larger associated variances carry the information of interest, while those with lower variances represent noise or undesired information [6], [34][35].
A detailed description of how the Principal Component Analysis technique can be deduced and how it can be applied is found in Shlens [35]. It is possible to deduce the concept of PCA intuitively and then, through mathematical rigor, derive its algebraic solution by EVD and SVD.
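The link between PCA and the SVD can be illustrated numerically. The sketch below assumes a data matrix with m parameters in rows and n measurements in columns, uses synthetic data, and only removes the row means (the standardization step is omitted for brevity). The principal components are taken as the right singular vectors of the transposed, scaled data matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 500                                   # hypothetical sizes
# Synthetic data: three parameters with clearly different spreads.
X = rng.standard_normal((m, n)) * np.array([[3.0], [1.0], [0.2]])
X = X - X.mean(axis=1, keepdims=True)           # zero-mean rows

C = X @ X.T / (n - 1)                           # covariance matrix, m x m

# Right singular vectors of X^T / sqrt(n - 1) are the eigenvectors of C.
_, s, Vt = np.linalg.svd(X.T / np.sqrt(n - 1), full_matrices=False)
PCs = Vt.T                                      # columns are the principal components
variances = s**2                                # variance along each component

# Scores: projection of the data onto the principal components.
scores = PCs.T @ X
print(np.allclose(C @ PCs, PCs * variances))
```

The scores obtained this way are mutually uncorrelated, and their variances equal the squared singular values.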

B. PROPOSED SSA-PCA TECHNIQUES
The earlier section shows that the PCA uses the right singular vectors of the SVD, vi, while the SSA uses the left singular vectors, ui. As both sets form orthonormal bases, this work explores techniques based on the projection of the signal X onto both bases and compares them.
A combination of parameters from the PCA of the Fourier Transform and the SSA is used in this paper to search for patterns that can be used to classify the level of water contamination in ethanol.

1) INTERPRETATION OF THE PRINCIPAL COMPONENTS
Because the eigenvectors of a covariance matrix are orthogonal to each other, the projections of the data onto them are uncorrelated. The principal components are therefore orthogonal vectors onto which the original data can be projected, and such projection reveals uncorrelated information present in the data. The interpretation of the principal components can be eased by defining scores as the projection of the original data onto the eigenvectors of the covariance matrix, i.e., onto the principal components. Each data set will generate components that must be analyzed to reach any interpretation. The search for patterns is done by associating each of the principal components with each of the scores, and it is carried out using bar graphs.

2) APPLICATION OF THE PCA TECHNIQUE TO THE RECONSTRUCTED ELEMENTARY SIGNALS OF THE SSA
The SSA technique decomposes a time series into elementary signals called reconstructed signals (RCi). To find patterns in the ultrasonic data collected, the SSA technique was used to decompose the pulse into its elementary spectral components associated with the singular vectors on the left (Columns of the U matrix, ui). The Fourier Transform was applied to each reconstructed signal to get the parameters for PCA.
The block diagram of this approach is shown in Fig. 2.

FIGURE 2. Methodology of application of the PCA analysis on the parameters of the FFT of the reconstructed signals.
This method decomposes the digital signal corresponding to the ultrasonic pulses using SSA and the Fourier Transform. All frequency bins are considered the input parameters of the PCA. In this way, it is found which principal component has a linear behavior and can separate the acoustic signatures obtained from mixtures with different concentrations of ethanol-water.

3) PROJECTION OF THE ORIGINAL SIGNAL ON THE BASE FORMED BY THE RECONSTRUCTED SIGNALS USING THE SSA
The SSA technique can decompose a time series X representing a single-channel signal into a sum of L reconstructed elementary signals. Each reconstructed elementary signal (RCi) is a component of the original signal X revealed by the SVD incorporated into the SSA [16][18] [30]. As each component is obtained by projecting the original signal onto the left orthogonal vectors (ui) of the SVD, they are linearly independent [30]. This paper proposes a new scoring model based on the projection of the input data onto the basis formed by the elementary signals reconstructed by the SSA technique. In analogy to the score based on the principal components, a score based on the reconstructed elementary signals (RCi) obtained in the reconstruction of the time series by the SSA technique is proposed. Each obtained component is placed in a column of the RC matrix. After the columns are orthogonalized by the Gram-Schmidt process, the original signal is projected onto each of them to obtain the scoreRC parameter. In matrix form, this proposed new score, called scoreRC, can be written as:
$$\text{scoreRC} = RC^{T} X \qquad (7)$$
where the RC matrix has as its columns the reconstructed elementary signals of the SSA, after orthogonalization by the Gram-Schmidt process, and X is the time series holding the ultrasonic pulse captured by the acquisition system after it has crossed the mixture and aggregated information about the medium.
The proposed scoreRC addresses the nonlinearity observed in the PCA score for certain training sets during the performed experiments.
The block diagram of this approach is shown in Fig. 3.
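The projection onto the orthogonalized reconstructed signals can be sketched as below. This is an illustration under stated assumptions rather than the implementation used in the experiments: the synthetic pulse, the window length L = 32, the choice of keeping only the first r = 4 reconstructed signals, and all function names are hypothetical.

```python
import numpy as np

def diagonal_average(Z):
    """Average an L x K matrix over its anti-diagonals."""
    L, K = Z.shape
    total = np.zeros(L + K - 1)
    count = np.zeros(L + K - 1)
    for i in range(L):
        total[i:i + K] += Z[i, :]
        count[i:i + K] += 1
    return total / count

def ssa_components(x, L, r):
    """First r reconstructed elementary signals of x, one per column."""
    K = x.size - L + 1
    X = np.column_stack([x[j:j + L] for j in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return np.column_stack(
        [diagonal_average(s[i] * np.outer(U[:, i], Vt[i, :])) for i in range(r)]
    )

def gram_schmidt(A):
    """Modified Gram-Schmidt: orthonormalize the columns of A from left to right."""
    Q = np.array(A, dtype=float)
    for j in range(Q.shape[1]):
        for i in range(j):
            Q[:, j] -= (Q[:, i] @ Q[:, j]) * Q[:, i]
        Q[:, j] /= np.linalg.norm(Q[:, j])
    return Q

def score_rc(x, L, r):
    """Project x onto the orthonormalized reconstructed signals."""
    Q = gram_schmidt(ssa_components(x, L, r))
    return Q.T @ x, Q

# Synthetic unit-energy pulse (hypothetical): noisy damped sinusoid.
rng = np.random.default_rng(3)
n = np.arange(256)
x = np.exp(-n / 60.0) * np.sin(2 * np.pi * n / 16.0) + 0.01 * rng.standard_normal(256)
x = x / np.sqrt(np.sum(x**2))

scores, Q = score_rc(x, L=32, r=4)
print(scores)
```

Each entry of `scores` is the score associated with one reconstructed elementary signal. The Gram-Schmidt step processes the columns in order, so the first column keeps the direction of the first reconstructed signal.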

IV. EXPERIMENTS AND RESULTS
Based on the methodologies described in the previous sections, several experiments were performed with ultrasonic signals obtained from the ethanol-water mixture in various concentrations.

A. EXPERIMENTAL SETUP
This section presents the characteristics of the system used to obtain the ultrasonic signals in this work, together with essential information about the collected signals. The test bench description can be found in Noronha [4]. A schematic drawing is shown in Fig. 4. Two ultrasonic transducers of the single crystal type S9208-AF84, from Physical Acoustic Corporation, are used. They work in pulsed mode and instrument a cuvette, one transmitting and the other receiving the signal. Both are positioned facing the liquid of the mixture. Each pulse applied to the transmitter is collected by the receiver after going through the liquid-phase mixture. The resulting analog signal is converted to digital using a National Instruments system with eight independent and simultaneous channels, variable sampling up to 2.5 MSamples/s, and 14-bit resolution over a −10 V to +10 V range. Two channels were used, one for the acoustic signal and another for the trigger timing.
A series of mixtures with proportions, as shown in Table I, were prepared from an initial volume of ethanol.
A volume is placed in the stirrer. It is then drained into the measuring cuvette until the cuvette is filled, and the first measurement is made. In each measurement, ten signals are collected. The cuvette is then emptied and filled again from the initial volume, and another measurement is made. This is repeated until the fifth measurement. After that, a new volume with the next proportion to be measured is prepared by adding the amount of water needed to give the desired proportion, and the whole process is repeated.
The collected signals are composed of 100,000 points sampled at 2.5 MHz, each containing a certain number of acoustic pulses (Fig. 5 (a)). Thus, the first step in treating these signals is to extract each pulse. An algorithm based on the detection of the signal envelope is applied, resulting in signals like the one in Fig. 5 (b), where 4096 points are needed to record the part of interest of each ultrasonic pulse. Under these conditions, the frequency resolution of the Fourier Transform is 610.35 Hz. Fig. 5 (c) illustrates part of the frequency spectrum of a normalized ultrasonic pulse. So that the energy contained in each pulse is unitary, the pulses are normalized by multiplying each sample of the signal by the inverse of the square root of its energy.
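The frequency resolution and the energy normalization quoted above follow directly from the sampling parameters. In the sketch below, the sampling rate and pulse length are taken from the text, while the function name `normalize_energy` and the test pulse are hypothetical.

```python
import numpy as np

fs = 2.5e6            # sampling rate in samples per second (from the text)
n_points = 4096       # samples kept per pulse (from the text)

# Spacing between FFT bins: fs / n_points.
delta_f = fs / n_points
print(round(delta_f, 2))

def normalize_energy(pulse):
    """Scale a pulse so that the sum of its squared samples equals 1."""
    pulse = np.asarray(pulse, dtype=float)
    energy = np.sum(pulse**2)
    return pulse / np.sqrt(energy)

# Hypothetical pulse used only to exercise the function.
t = np.arange(n_points) / fs
pulse = np.exp(-t / 100e-6) * np.sin(2 * np.pi * 200e3 * t)
unit_pulse = normalize_energy(pulse)
```

Normalizing to unit energy removes amplitude differences between acquisitions, so the later analysis depends on the shape of the pulse rather than on its overall level.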
Applying the SSA technique to the pulse shown in Fig. 5 (b), the reconstructed elementary signals (RCi) are obtained, and the first six are shown in Fig. 6.
Each pulse is considered an experiment performed to be applied in the two techniques proposed in this work. The first technique decomposes the digital signal corresponding to the ultrasonic pulses using SSA and the Fourier Transform. All frequency bins are considered the input parameters of the PCA. In a second approach, the SSA will generate the reconstructed elementary signals corresponding to the ultrasonic pulse, which, represented in matrix form, are used to obtain a score for the % of water in the mixture.

B. SAMPLE CLASSIFICATION
The classification of the samples is demonstrated by applying the PCA technique to signals obtained from ethanol-water blends with quite different levels of water. The signals obtained from mixtures A and M in Table I are used in this experiment. For each of the two contamination levels, 100 pulses are analyzed. The parameters obtained by applying the Fourier Transform to the pulses are treated as column vectors and placed in the matrix for the PCA. The reconstructed elementary signals (RCi) are placed in the columns of the matrix used to generate the proposed scoreRC.
Applying the methodology described in III.B.2, a clear separation between the first 100 pulses from mixture A and the subsequent 100 pulses from mixture M is obtained, as shown in Fig. 7. This same classification using the scoreRC defined in (7) is obtained as shown in Fig. 8, where different levels are evident between the 100 pulses corresponding to each level of contamination, in this case, about the first elementary signal reconstructed with the SSA technique. From these results, the same techniques are applied to ultrasonic signals obtained from different proportions of the ethanol-water mixture described in Table I, and their results are discussed below.
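The FFT-plus-PCA classification step can be illustrated with synthetic data. The sketch below does not use the measured signals: it assumes two groups of noisy damped sinusoids whose center frequencies differ, as a stand-in for pulses that crossed two different mixtures, and it computes the PC#1 score of the magnitude spectra. Group sizes, pulse length, and frequencies are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
fs, N, per_group = 2.5e6, 512, 20       # hypothetical pulse length and group size
t = np.arange(N) / fs

def synthetic_pulse(f0):
    """Noisy damped sinusoid normalized to unit energy (stand-in for a measured pulse)."""
    p = np.exp(-t / 16e-6) * np.sin(2 * np.pi * f0 * t)
    p = p + 0.01 * rng.standard_normal(N)
    return p / np.sqrt(np.sum(p**2))

# Two groups that differ only in center frequency (assumed effect of the medium).
pulses = [synthetic_pulse(200e3) for _ in range(per_group)]
pulses += [synthetic_pulse(220e3) for _ in range(per_group)]

# One column per pulse, one row per frequency bin.
F = np.column_stack([np.abs(np.fft.rfft(p)) for p in pulses])
Fc = F - F.mean(axis=1, keepdims=True)  # remove the mean of each bin

# Left singular vectors of Fc are the eigenvectors of Fc Fc^T,
# i.e. the principal components in frequency-bin space.
U, s, _ = np.linalg.svd(Fc, full_matrices=False)
score_pc1 = U[:, 0] @ Fc                # one PC#1 score per pulse

group_a, group_b = score_pc1[:per_group], score_pc1[per_group:]
print(group_a.mean(), group_b.mean())
```

With this synthetic data the two groups fall on distinct score levels. The sign of the score is arbitrary, because singular vectors are defined only up to sign.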

C. ESTIMATION OF THE PROPORTION OF WATER IN THE ETHANOL-WATER MIXTURE
Experiments are conducted with different proportions in the mixture to analyze different ranges of moisture contamination in ethanol. The first experiment covers a narrow range of moisture contamination, from 0 to 5% in 1% steps, represented by mixtures A to F in Table I. The second experiment uses mixtures with water proportions ranging from 0 to 35% in 5% steps, and the third combines the signals used in the two initial experiments. Each of them is detailed in the next sections.

1) ADDITION OF WATER FROM 0 TO 5% IN 1% STEPS
As input data for the proposed methods, 50 ultrasonic pulses for each level of water contamination in ethanol are used, corresponding to six different ethanol-water proportions with the addition of 0%, 1%, 2%, 3%, 4%, and 5% water to the volume of ethanol, referring to mixtures A to F in Table I. The complete score analysis on PC#1 arising from the FFT parameters of the ultrasonic pulses shows a direct relationship between the score and the percentage of water in the ethanol-water mixture, as shown in Fig. 9. The results for the scoreRC, represented in Fig. 10, show that the separations between the levels can be seen unequivocally for the third and fourth elementary signals. The fourth elementary signal presents the best results, as it shows the most significant difference in value between the levels obtained for each group in Table I used in this experiment, allowing a better approximation in the linear regression.
The scoreRC values related to the first and second elementary signals show ambiguity in the threshold values, as shown in Fig. 10.

2) ADDITION OF WATER FROM 0 TO 35% IN 5% STEPS
In this experiment, 50 ultrasonic pulses were used for each level of water contamination in ethanol, corresponding to eight different ethanol-water proportions with the addition of 0%, 5%, 10%, 15%, 20%, 25%, 30%, and 35% water to the volume of ethanol. The pulses are obtained by applying the ultrasonic signal at each level, represented by Mixtures A and F to L in Table I.
The complete analysis of the score on PC#1 of the parameters resulting from the FFT of the ultrasonic pulses showed a direct relationship between the score and the percentage of water in the ethanol-water mixture, as can be seen in Fig. 11. Pulses 1 to 50 come from the ethanol sample without added water (Mixture A). The level obtained with pulses 51 to 100 comes from the score of the pulses captured in Mixture F, with the addition of 5% water to the volume of ethanol. The visible levels for pulses 101 to 150, 151 to 200, 201 to 250, 251 to 300, 301 to 350, and 351 to 400 are derived, respectively, from the addition of 10%, 15%, 20%, 25%, 30%, and 35% water to the volume of ethanol, represented by Mixtures G to L. The analysis of the scoreRC shows that the separations between the levels can be seen unambiguously for the second elementary signal, where a single value represents each proportion of water addition in the ethanol-water mixture, as seen in Fig. 12.

3) ADDITION OF WATER FROM 0 TO 5% IN 1% STEPS AND FROM 5 TO 35% IN 5% STEPS
In this experiment, 50 pulses from each of Mixtures A to L in Table I were used, covering the 12 proportions of water mixed with ethanol, which correspond to the addition of 0%, 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, and 35% water to the volume of ethanol. Fig. 13 illustrates the values of the score on PC#1 of the parameters coming from the FFT of the ultrasonic pulses. Different proportions of water in the ethanol-water mixture are associated with levels of the same value, which creates an ambiguity for mixtures A and C and makes classification with this technique impossible.
In Fig. 14, pulses 1 to 50 come from the ethanol sample without added water (Mixture A). The level obtained with pulses 51 to 100 comes from the scoreRC for pulses captured in Mixture B, which has 1% water added to the volume of ethanol. The subsequent levels were obtained using the pulses from mixtures C to L, which represent, respectively, the addition of 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, and 35% water to the volume of ethanol.
Unlike what occurred in the FFT PCA analysis seen in Fig. 13, in the scoreRC analysis the levels of the second elementary signal take a unique value for each proportion of water in the ethanol-water mixture, enabling the classification of pulses in this hybrid set of proportions.

D. SAMPLE CLASSIFICATION
The results in Fig. 7 to 14, presented in the earlier sections, demonstrate the feasibility of applying the proposed techniques to classify the proportions of water in ethanol fuel, both in a narrow range of contamination (0% to 5%, steps of 1%) and in a broader range (0% to 35%, steps of 5%). For both ranges, the average values of the score obtained with the PCA for each of the groups can be used. As each group is related to a ratio of water to ethanol, linear regression yields an equation in which the x-axis corresponds to the value of the score and the y-axis to the ethanol-water ratio.
It is possible to use the results presented in Fig. 14 to obtain a general expression combining the narrow and broad ranges. The average values of the scoreRCs of the second reconstructed elementary signal using Singular Spectrum Analysis (SSA) can be related to each contamination group. Likewise, each group is related to a proportion of water in ethanol, so one can also obtain an equation that relates the value of the scoreRC found with the ethanol-water ratio, allowing the classification of a particular sample.
The average scoreRC value for each group can thus be associated with the corresponding ethanol/water ratio, as presented in Table II, and a relation between these values can be obtained. A third-order polynomial regression, shown in Fig. 15, results in
$$y = -3.77398\,x^{3} + 4.77176\,x^{2} - 5.94885\,x + 7.45136 \qquad (8)$$
where y corresponds to the water addition, given in %, and x corresponds to the scoreRC for the second reconstructed elementary signal.
The % of water present in the ethanol-water mixture can be estimated using (8).
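As an illustration, (8) can be evaluated directly once a scoreRC value is available. The sketch below assumes that (8) is a cubic polynomial in x with the coefficients as printed; the function name and the input values are hypothetical.

```python
import numpy as np

# Coefficients of (8), highest power first, as printed in the text.
EQ8_COEFFS = np.array([-3.77398, 4.77176, -5.94885, 7.45136])

def estimate_water_percent(score_rc):
    """Evaluate (8) at a scoreRC value (scalar or array) to estimate water in %."""
    return np.polyval(EQ8_COEFFS, score_rc)

# Hypothetical scoreRC values, used only to show the call.
estimates = estimate_water_percent(np.array([0.0, 1.0]))
```

The estimate is only meaningful within the range of scoreRC values covered by the calibration groups, since a cubic fit can behave arbitrarily outside the fitted interval.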
With these results, it was possible to obtain a non-invasive method using digital signal processing techniques. The method is based on the analysis of ultrasonic signals that crossed ethanol-water mixtures in different proportions and underwent changes in their characteristics due to the presence of water. This method therefore provides a basis for classification by acoustic signature.

V. CONCLUSION
This work aims to contribute to monitoring the quality of ethanol fuel with respect to moisture contamination through non-invasive techniques that allow online monitoring. The initial objective was achieved: verifying that water can be detected through the analysis of ultrasonic signals that crossed the mixture in the liquid phase and had their characteristics altered by the presence of water.
To identify and quantify the water levels in the ethanol-water mixture, a combination of the Principal Component Analysis (PCA) and Singular Spectrum Analysis (SSA) techniques was used. This method resulted in the proposal of a new score linked to the ethanol/water ratio information present in the mixture, which was used to classify the pulses of mixtures with different degrees of concentration.
The proposed methodologies aimed to determine which parameters were more sensitive to water in the mixture, making it possible to relate these acoustic signatures and obtain expressions that supply the ethanol/water ratio estimation.
Experimental results show that the proposed methodologies could indicate the moisture level in ethanol samples. The results obtained with the SSA algorithm led to an expression that links the scoreRC and the ethanol/water ratio, given in %, allowing the classification of a particular sample.
Mathematical transformations were also applied to the signals to obtain parameters for the analysis. The tests performed with the Fourier Transform generated acoustic signatures that allowed the classification of contaminated samples in steps of 1% and in steps of 5% when each range was treated in isolation. The results obtained with the PCA technique related the scores to the ethanol-water ratio. The same results were obtained by applying the SSA technique with the scoreRC proposed in this paper, i.e., the projection of the ultrasonic signal onto the reconstructed elementary signals from the SSA.
A significant result was that, when combining data from samples contaminated in steps of 1% with data from samples contaminated in steps of 5%, it was still possible to find parameters that associate the proposed scoreRC with the ethanol/water ratio. In this combined case, the analysis of the PCA score of the Fourier Transform acoustic signature presents ambiguities in the results, which prevents its use to generate an expression that directly relates the score to the ethanol-water ratio.
An application of this method, in practice, would consist of equipment similar to the one used in the experiments, but more compact, which would be coupled to the fuel supply line of a machine/engine and take measurements during fueling. The sensors would be previously calibrated in the laboratory using the same method described in this article. The measurement results would allow evaluating whether the fuel is acceptable and even inferring its yield/efficiency. Another use could be portable equipment for on-site verification of fuel quality by collecting samples in tanks, presenting the result instantly. Finally, the proposed techniques also provide alternative methodologies for analyzing other contaminants and other mixtures.