Hybrid Auto-Scaling for an Asynchronous Computationally Intensive Application

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Arnþór Jóhann Jónsson; [2020]


Abstract: Cloud computing has developed rapidly in recent years. It enables users to access computing resources on demand, acquiring resources when an application requires them and releasing them when they are no longer needed. This characteristic of cloud computing, often referred to as elasticity, raises the question of how to manage these resources efficiently, as there is a tradeoff between minimizing resource cost and meeting Quality of Service (QoS) requirements. The choice of auto-scaler can significantly affect both resource consumption and QoS in cloud applications. Container orchestration systems such as Kubernetes rely on static thresholds and low-level infrastructure metrics such as CPU and memory to scale cloud applications and foresee workloads that arise at runtime. This research presents a hybrid auto-scaling method that uses both reactive and proactive metrics for an asynchronous, computationally intensive application used in a publish-subscribe pattern. The new hybrid method is compared to the Kubernetes Horizontal Pod Autoscaler (HPA) with both synthetic and real-world workloads. Based on the experimental results, the two methods are compared with regard to resource consumption and QoS. For the synthetic workloads, the hybrid method used, on average, 39.44% fewer pods than HPA and achieved 52.97% higher CPU utilization, while providing a similar and often more reliable QoS than HPA. For the real-world workload, the method presented in this thesis used, on average, 47.26% fewer pods than HPA. Furthermore, it achieved 73.78% higher CPU utilization, while averaging a significantly smaller queue and lower latency than HPA.
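
To make the contrast in the abstract concrete, the sketch below places the threshold-based rule documented for Kubernetes HPA (desired replicas = ceil(current replicas × current metric / target metric), with HPA's tolerance band and stabilization window omitted) next to a hypothetical hybrid rule. The hybrid function is an illustrative assumption only and is not the algorithm from the thesis, which the abstract does not specify. The names `hybrid_desired_replicas`, `queue_length`, `forecast_arrivals`, `per_pod_throughput`, and `window_seconds` are invented for this example: the reactive signal is the current backlog of a message queue, and the proactive signal is a forecast of messages expected in the next scaling window.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Threshold-based rule documented for Kubernetes HPA.

    Simplified: the tolerance band and stabilization window are omitted.
    """
    return math.ceil(current_replicas * current_metric / target_metric)


def hybrid_desired_replicas(queue_length, forecast_arrivals,
                            per_pod_throughput, window_seconds,
                            min_replicas=1, max_replicas=50):
    """Hypothetical hybrid rule (an assumption, not the thesis's method).

    Reactive part: messages already waiting in the queue.
    Proactive part: messages forecast to arrive during the next window.
    Both are divided by the work one pod can finish in that window.
    """
    pending_work = queue_length + forecast_arrivals
    capacity_per_pod = per_pod_throughput * window_seconds
    needed = math.ceil(pending_work / capacity_per_pod)
    # Clamp to the allowed replica range.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # 4 pods at 90% average CPU against a 60% target.
    print(hpa_desired_replicas(4, 90, 60))
    # 120 queued messages, 80 forecast, 2 messages/s per pod, 30 s window.
    print(hybrid_desired_replicas(120, 80, 2, 30))
```

A queue-based signal is a natural fit for a publish-subscribe workload because the backlog measures outstanding work directly, whereas CPU utilization only reflects how busy the existing pods are.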
