Docker Orchestration for Scalable Tasks and Services

University essay from KTH/Skolan för informations- och kommunikationsteknik (ICT)

Author: Tobias Wiens; [2015]


Abstract: Distributed services and tasks in the cloud, such as large-scale data processing (Big Data), have become popular in recent years. With them came the possibility of scaling infrastructure according to current demand, which promises to reduce costs. However, running and maintaining a company-internal cloud that is extended to one or more public clouds (a hybrid cloud) is a complex challenge. In recent years, Docker containers have become very popular, with the promise of solving compatibility issues in hybrid clouds: packaging software together with its dependencies inside Docker containers promises fewer incompatibility issues. Combining hybrid clouds and Docker containers leads to more cost-effective, reliable and scalable data processing in the cloud. The problem addressed in this thesis is how to manage hybrid clouds that run scalable distributed tasks or services. Fluctuating demand requires adding computers to or removing them from the current infrastructure, and changes the dependencies that are necessary to execute tasks or services. The challenge is to provide all dependencies for reliable execution of tasks and services. Furthermore, distributed tasks and services need to be able to communicate even on a hybrid infrastructure. The approach of this thesis is to prototype three different Docker integrations for Activeeon’s ProActive, a hybrid cloud middleware. Each prototype is evaluated, and one is developed further into an early-stage product. The software-defined networks weave and flannel are benchmarked for their impact on network performance. How Docker containers affect CPU, memory and disk performance is analyzed through a literature review. Finally, the distributed large-scale data processing software Apache Flink is benchmarked inside containers to measure the impact of containerizing such software.
The results of this thesis show that Docker container orchestration is feasible with ProActive and the software-defined networks weave and flannel. While both networks show an impact on raw network performance, the Apache Flink benchmark did not reveal any impact from using containers and software-defined networks. Therefore, Docker containers, together with orchestration through ProActive, are able to form a large-scale data processing platform.
