In large-scale cloud computing systems, the growing scale and complexity of component interactions make it difficult for operators to understand system performance characteristics. Performance profiling has long proven to be an effective approach to performance analysis; however, existing approaches do not address two requirements that emerge in cloud computing systems. First, profiling efficiency becomes a critical concern; second, visual analytics should be used to make profiling results more readable. To address these two issues, we present P-Tracer, an online performance profiling approach specifically tailored for large-scale cloud computing systems. P-Tracer builds a dedicated search engine that proactively processes performance logs and generates indices for fast queries; furthermore, P-Tracer provides users with a suite of web-based interfaces for querying statistical information about all kinds of services, which helps them quickly and intuitively understand system behavior. The approach has been successfully applied at Alibaba Cloud Computing Inc. to conduct online performance profiling in both production and test clusters. Experience with one real-world case demonstrates that P-Tracer can effectively and efficiently help users conduct performance profiling and localize the primary causes of performance anomalies.
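To make the idea of proactive log processing concrete, the following is a minimal sketch, not P-Tracer's actual implementation. It assumes a hypothetical log format of `timestamp service latency_ms` per line, and the names `PerfLogIndex` and `ServiceStats` are illustrative only. Statistics are aggregated per service as each log line arrives, so a later query is a dictionary lookup rather than a scan over raw logs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ServiceStats:
    """Running aggregates for one service (hypothetical structure)."""
    count: int = 0
    total_ms: float = 0.0
    max_ms: float = 0.0

    def add(self, latency_ms: float) -> None:
        # Update the aggregates in O(1) for each new log record.
        self.count += 1
        self.total_ms += latency_ms
        self.max_ms = max(self.max_ms, latency_ms)

    @property
    def mean_ms(self) -> float:
        return self.total_ms / self.count if self.count else 0.0


class PerfLogIndex:
    """Index built at ingestion time, keyed by service name."""

    def __init__(self) -> None:
        self._stats = defaultdict(ServiceStats)

    def ingest(self, line: str) -> None:
        # Assumed log format: "<timestamp> <service> <latency_ms>".
        _timestamp, service, latency = line.split()
        self._stats[service].add(float(latency))

    def query(self, service: str) -> ServiceStats:
        # Lookup only; no raw log lines are rescanned at query time.
        return self._stats.get(service, ServiceStats())
```

Under these assumptions, a front end such as a web interface would call `query` with a service name and render the returned aggregates. A real system would also need time-windowed indices and persistence, which this sketch omits.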