Structure of the proposed TFTOps model. TFTOps utilizes probabilistic input and metadata to achieve state-of-the-art performance in DevOps metric prediction.
Abstract:
This paper introduces a new generic and scalable framework for large-scale time series prediction and unsupervised anomaly detection. State-of-the-art time series anomaly detection techniques, most of which are based on neural networks, typically train one network per time series. However, a typical modern microservice system consists of hundreds of active nodes or instances, and monitoring its performance often requires tracking thousands of time series that describe different aspects of the system, including CPU usage, call latency, and workloads. We introduce a new methodology that groups metrics of the same type and predicts hundreds of metrics concurrently with a single neural network model with shared parameters. The model also integrates probabilistic representations with Temporal Fusion Transformers for better performance. On a real-world dataset, our proposed model achieved up to a 50% improvement in terms of MSE.
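The grouping idea can be illustrated with a minimal sketch. It is not the TFTOps architecture: a pooled linear autoregressive model stands in for the neural network, and the function names, the `(node, metric_type)` key layout, the lag of 8, and the synthetic data are all assumptions made for illustration. The point it shows is only that series of the same metric type can share one set of parameters instead of one model per series.

```python
from collections import defaultdict

import numpy as np


def group_by_type(series):
    """Group series by metric type. Keys are hypothetical (node, metric_type) pairs."""
    groups = defaultdict(list)
    for (_node, metric_type), values in series.items():
        groups[metric_type].append(np.asarray(values, dtype=float))
    return dict(groups)


def make_windows(values, lag):
    """Turn one series into (lagged inputs, next-step targets)."""
    X = np.stack([values[i:i + lag] for i in range(len(values) - lag)])
    y = values[lag:]
    return X, y


def fit_shared_model(group, lag):
    """Fit ONE weight vector on windows pooled from every series in the group.

    Linear stand-in for the shared-parameter network (an assumption).
    """
    Xs, ys = zip(*(make_windows(v, lag) for v in group))
    X = np.vstack(Xs)
    y = np.concatenate(ys)
    X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict_next(weights, values, lag):
    """One-step-ahead forecast from the last `lag` observations."""
    window = np.append(np.asarray(values, dtype=float)[-lag:], 1.0)
    return float(window @ weights)


# Synthetic example: two CPU series and one latency series.
rng = np.random.default_rng(0)
t = np.arange(200)
series = {
    ("node-a", "cpu"): np.sin(t / 10) + 0.01 * rng.standard_normal(200),
    ("node-b", "cpu"): np.sin(t / 10 + 1.0) + 0.01 * rng.standard_normal(200),
    ("node-a", "latency"): 0.5 * np.cos(t / 7) + 0.01 * rng.standard_normal(200),
}

LAG = 8
groups = group_by_type(series)
# One model per metric type, not one per series.
models = {name: fit_shared_model(g, LAG) for name, g in groups.items()}
cpu_forecast = predict_next(models["cpu"], series[("node-a", "cpu")], LAG)
```

Here three series yield only two fitted models, and both CPU series are forecast with the same weights. A large deviation between such a forecast and the observed value is the kind of signal an unsupervised anomaly detector could threshold on, though how TFTOps does this is not specified in the abstract.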
Published in: IEEE Access (Volume: 12)