Predicting resource usage on a Kubernetes platform using Machine Learning Methods

University essay from Karlstads universitet/Institutionen för matematik och datavetenskap (from 2013)

Abstract: Cloud computing and containerization have been on the rise in recent years and have become important areas of research and development in computer science. One of the challenges in distributed and cloud computing is predicting the resource utilization of the nodes that run applications and services. This is especially relevant for container-based platforms such as Kubernetes, where predicting the resource utilization of a cluster can help optimize the performance, reliability, and cost-effectiveness of the platform. This thesis focuses on how well different resources in a cluster can be predicted using machine learning techniques. The approach consists of three main steps: data collection, data extraction and pre-processing, and data analysis. The data collection step involves stressing the system with a load generator called Locust and collecting data both from Locust and, using Prometheus, from Kubernetes. The data extraction and pre-processing step involves extracting relevant data and transforming it into a format suitable for the machine learning models. The final step involves applying different machine learning models to the data and evaluating their accuracy. The results of this thesis illustrate that machine learning can work well for predicting resource usage in a cluster based on how stressed the system is, and that the best-performing machine learning model tested was a Support Vector Machine with a polynomial kernel.
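The data extraction and pre-processing step could look roughly like the following minimal sketch. It is not taken from the thesis: the sample format, the values, and the helper names align_samples and min_max_scale are all hypothetical. It assumes that load measurements (as Locust might report them) and a resource series (as Prometheus might return it) arrive as timestamped pairs, joins them by nearest timestamp, and scales the feature column, since kernel-based models such as Support Vector Machines are generally sensitive to feature scale.

```python
from bisect import bisect_left

# Hypothetical samples: (unix_timestamp_seconds, value).
# load_samples stands in for Locust output (requests per second),
# cpu_samples for a Prometheus CPU-usage series (cores).
load_samples = [(0, 10.0), (10, 20.0), (20, 40.0), (30, 80.0)]
cpu_samples = [(1, 0.12), (11, 0.21), (19, 0.43), (45, 0.90)]


def align_samples(features, targets, max_gap_s=5):
    """Pair each feature sample with the nearest target sample in time.

    Pairs whose timestamps differ by more than max_gap_s are dropped,
    so that a missing scrape does not produce a misleading row.
    Both inputs must be sorted by timestamp.
    """
    target_times = [t for t, _ in targets]
    rows = []
    for ts, x in features:
        i = bisect_left(target_times, ts)
        # Candidates: the target sample just before and just after ts.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(targets)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(target_times[j] - ts))
        if abs(target_times[best] - ts) <= max_gap_s:
            rows.append((x, targets[best][1]))
    return rows


def min_max_scale(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


rows = align_samples(load_samples, cpu_samples)
X = min_max_scale([x for x, _ in rows])  # scaled load feature
y = [cpu for _, cpu in rows]             # prediction target
```

In this sketch the last load sample is discarded because no CPU sample lies within five seconds of it. The resulting X and y lists are in the shape a regression model would expect for a single feature; the gap threshold and the choice of min-max scaling are illustrative assumptions.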
