Performance Evaluation of Kubernetes Autoscaling strategies on GKE clusters

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Cloud computing and containerisation have experienced significant growth in recent years. With cloud providers requiring users to specify resource limits and requests, the need for performance and resource optimisation has emerged in the cloud computing domain. This thesis focuses on examining three autoscaling approaches in the Kubernetes container orchestrator: Hybrid Pod Autoscaler, Vertical Pod Autoscaler (VPA), and Horizontal Pod Autoscaler (HPA). To conduct the analysis, a production-grade microservice was deployed on a GKE cluster, replicating the workload of the host company Nordnet Bank AB, a pan-Nordic platform for savings and investments. The main objective was to investigate the impact of the different autoscalers on the 50th and 99th percentile response times. The study also aimed to investigate whether a hybrid pod autoscaler, combining VPA and HPA, could outperform HPA and VPA in terms of response time and CPU usage. Additionally, the study aimed to identify the service metrics that an orchestrator can use to achieve response times similar to those obtained when resources are over-provisioned. The research findings indicate that response times varied significantly depending on the autoscaling strategy. While the 50th percentile response times remained consistent, the 99th percentile exhibited greater variation. Among the strategies, HPA demonstrated consistent performance, albeit with greater variability in the 99th percentile response times. The VPA strategy, in contrast, resulted in higher response times for both the 50th and 99th percentiles compared to the baseline. The hybrid approach generally outperformed VPA in terms of response times while showing comparable performance to HPA, although with slightly greater variability. CPU usage patterns of the hybrid approach were more closely aligned with those of HPA than with those of VPA.
CPU usage and request rate proved effective as service metrics that an orchestrator can use to achieve acceptable 99th percentile response times, as demonstrated by both HPA and the hybrid approach. Nevertheless, these findings are contingent on the specific autoscaler configuration, microservice, and workload model used in this study and may not be universally applicable.
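
For readers unfamiliar with how HPA turns a metric such as CPU usage or request rate into a replica count, the following is a minimal Python sketch of the scaling rule described in the Kubernetes documentation. It is not taken from the thesis: the function name, the metric values, and the use of the documented default tolerance of 0.1 are assumptions for illustration, and the autoscaler configuration actually used in the study may differ.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Sketch of the HPA rule: ceil(currentReplicas * currentMetric / targetMetric).

    All names and values are illustrative assumptions, not the thesis setup.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band around 1.0 the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical example: 4 pods averaging 90% CPU against a 60% target.
scaled_up = desired_replicas(4, 90, 60)
# Hypothetical example: 4 pods averaging 30% CPU against a 60% target.
scaled_down = desired_replicas(4, 30, 60)
# Hypothetical example: a small deviation stays inside the tolerance band.
unchanged = desired_replicas(4, 63, 60)
```

The same ratio-based rule applies whether the metric is average CPU utilisation or requests per second per pod, which is why both can serve as scaling signals.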
