Project-based Multi-tenant Container Registry For Hopsworks

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Data usage has grown substantially over the past decade, and cloud technologies and big data platforms have gained popularity because they help process such data at large scale. Hopsworks is one such managed platform for scale-out data science. It is an open-source platform for the development and operation of machine learning models, available on-premise and as a managed platform in the cloud. Like most such platforms, which provide data science environments that collate the required libraries, Hopsworks provides users with Anaconda environments.

Hopsworks provides multi-tenancy, ensuring a secure model for managing sensitive data on the shared platform. Most Hopsworks features are built around projects, and each project includes an Anaconda environment that provides users with a number of libraries for processing data. Creating a project triggers the creation of a base Anaconda environment, and each added library updates this environment. For an on-premise application, as data science teams are diverse and work towards building repeatable and scalable models, it becomes increasingly important to manage these environments in a central, local location.

The purpose of the thesis is to provide secure storage for these Anaconda environments. As Hopsworks uses a Kubernetes cluster to serve models, these environments can be containerized and stored in a secure container registry on the Kubernetes cluster. The solution also aims to extend the multi-tenancy feature of Hopsworks to the locally hosted storage. The implementation comprises two parts. The first is to host a compatible open-source container registry that stores the container images on a local Kubernetes cluster, with fault tolerance and without a single point of failure. The second is to leverage the multi-tenancy feature of Hopsworks by storing the images in the self-sufficient, secure registry with project-level isolation.
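
As an illustration only, project-level isolation in a registry can be pictured as one repository namespace per project, with access limited to that project's members. The sketch below is not the thesis implementation: the registry address, the `base-env` image name, and the helpers `image_for_project` and `can_access` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Set

# Hypothetical registry address; not taken from the thesis.
REGISTRY = "registry.example.local:5000"


@dataclass(frozen=True)
class ImageRef:
    registry: str
    project: str  # tenant namespace, assumed to be one per project
    name: str     # image name, e.g. the project's environment image
    tag: str

    def __str__(self) -> str:
        return f"{self.registry}/{self.project}/{self.name}:{self.tag}"


def image_for_project(project: str, name: str = "base-env", tag: str = "latest") -> ImageRef:
    # Repository names in container registries must be lowercase.
    return ImageRef(REGISTRY, project.lower(), name, tag)


def can_access(member_of: Set[str], ref: ImageRef) -> bool:
    # Isolation rule assumed here: only members of the owning project may use the image.
    return ref.project in member_of
```

For example, `image_for_project("DemoProject")` yields a reference under the `demoproject` namespace, and `can_access({"other"}, ref)` is false for a user who is not a member of that project.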
