Identification of Education Activity Based on Datalake Captured from Internal Sensor Data of Supercomputer | IEEE Conference Publication | IEEE Xplore

Abstract:

Analysis of nonlinear and nonstationary systems requires special methods that go beyond the classical Fourier Transform. The Fast Fourier Transform, wavelets, and the Wavelet Transform let researchers find temporal properties of data series at different frequencies, time moments, and scales. When a one-dimensional nonstationary signal is partitioned with Empirical Mode Decomposition or Variational Mode Decomposition, a set of intrinsic mode functions is created, each representing a stationary component in a specific frequency band. Both decompositions are used to identify important characteristic features, called modes, of the initial process. These stationary modes can then be analyzed with the Fourier Transform and with classical statistical methods for time series. Dynamic Mode Decomposition is proposed for the analysis of coherent multidimensional processes distributed in space and time simultaneously. Its eigenvalues, eigenvectors, and modes are used to identify spatiotemporal structures of the nonstationary initial process. It is nontrivial to evaluate the scientific effect of using these different methods jointly. In this study, a multidimensional dataset captured from inside a supercomputer with 127 computation nodes was analyzed for three different operation queues used for education and research. The dataset, with 1.7 million samples, was uploaded to Kaggle. Internal data from physical sensors contained the temperature of memory cards, processors, and systems. Internal data from logical sensors included the load of the node, the number of running processes, the number of input and output packets on the network interface card, and the load of the processors. The sampling period and duration of these data series were one minute and one week, respectively. It was found that the joint usage of the specific methods depended on the character of the dataset.
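As a rough illustration of how Dynamic Mode Decomposition yields eigenvalues and modes from multichannel data, the following is a minimal sketch of the standard "exact DMD" procedure in NumPy. It is not the authors' implementation. The function name `dmd`, the synthetic five-channel signal, the decay factor, the 60-sample period, and the chosen rank are all assumptions made for this example and do not come from the paper or its dataset.

```python
import numpy as np


def dmd(X, rank):
    """Exact DMD sketch. X has shape (n_channels, n_snapshots).

    Returns the DMD eigenvalues and the corresponding spatial modes.
    """
    # Pair each snapshot with its successor in time.
    X1, X2 = X[:, :-1], X[:, 1:]

    # Truncated SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    V = Vh.conj().T

    # Reduced linear operator: U^H X2 V S^{-1}
    # (dividing by s scales column j by 1 / s[j]).
    A_tilde = (U.conj().T @ X2 @ V) / s

    # Eigen-decomposition of the reduced operator.
    eigvals, W = np.linalg.eig(A_tilde)

    # Exact DMD modes: X2 V S^{-1} W
    modes = ((X2 @ V) / s) @ W
    return eigvals, modes


# Synthetic stand-in for sensor data (hypothetical, not the paper's dataset):
# a slowly decaying oscillation in a 2-D latent state, mixed into 5 channels.
rng = np.random.default_rng(0)
decay = 0.99                      # assumed per-sample decay factor
theta = 2 * np.pi / 60            # assumed period of 60 samples
A = decay * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])

n_snapshots = 200
Z = np.empty((2, n_snapshots))
Z[:, 0] = [1.0, 0.0]
for k in range(1, n_snapshots):
    Z[:, k] = A @ Z[:, k - 1]

C = rng.standard_normal((5, 2))   # random mixing into 5 "sensor" channels
X = C @ Z

eigvals, modes = dmd(X, rank=2)

# For this synthetic linear system the eigenvalue moduli should be
# close to the assumed decay factor.
print(np.abs(eigvals))
```

In this sketch the modulus of each eigenvalue indicates growth or decay of a mode and its angle indicates the oscillation frequency per sample, while each column of `modes` gives the pattern of that mode across channels. Real sensor data would generally need preprocessing and a rank chosen from the singular value spectrum.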
Date of Conference: 26-28 January 2024
Date Added to IEEE Xplore: 08 May 2024
Conference Location: Bangkok, Thailand
