A Comparative Evaluation of Failover Mechanisms for Mission-critical Financial Applications in Public Clouds

University essay from Umeå universitet/Institutionen för datavetenskap

Abstract: Computer systems can fail for a vast range of reasons, and handling failures is crucial to any critical computer system. Many modern computer systems are migrating to public clouds, which offer more flexible resource consumption and in many cases reduced costs, but migration can also require system changes due to limitations of the cloud environment. This thesis evaluates several methods of achieving failover when migrating a system to a public cloud, with the main goal of finding a replacement for failover mechanisms that can only be used in self-managed infrastructure. The methods are compared in terms of how each would change an existing system. Two of them, based on etcd and Apache ZooKeeper, are evaluated experimentally by measuring failover time in two simulated scenarios in which the primary process terminates and a standby process must be promoted to primary status. In the first scenario, the primary process cannot notify other processes in the system before terminating; in the second, the primary process can release the primary status to another instance before terminating. The etcd and ZooKeeper solutions are shown to behave quite similarly in the testing setup, although the ZooKeeper solution might be able to achieve lower failover time in low-latency environments.
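
The difference between the two scenarios can be illustrated with a toy simulation. The sketch below is not the thesis's implementation and does not use the etcd or ZooKeeper client APIs. It assumes a hypothetical in-memory lock (`LeaseLock`) whose ownership expires after a time-to-live, loosely analogous to an etcd lease or a ZooKeeper session timeout, and a standby that polls at a fixed interval. The names `LeaseLock` and `failover_time` and all parameter values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaseLock:
    """Hypothetical in-memory stand-in for a coordination-service lock."""
    ttl: float                     # lease time-to-live, in seconds
    holder: Optional[str] = None   # name of the current primary, if any
    expires_at: float = 0.0        # simulated time at which the lease lapses

    def try_acquire(self, node: str, now: float) -> bool:
        # The lock is free if nobody holds it or the holder's lease expired.
        if self.holder is None or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return self.holder == node

    def release(self, node: str) -> None:
        # Graceful hand-over: the holder gives up the lock explicitly.
        if self.holder == node:
            self.holder = None


def failover_time(graceful: bool, ttl: float = 5.0, poll: float = 0.5,
                  last_refresh_age: float = 1.0) -> float:
    """Simulated seconds from primary termination (t = 0) to promotion."""
    lock = LeaseLock(ttl=ttl)
    # The primary last refreshed its lease `last_refresh_age` seconds
    # before terminating at t = 0.
    lock.try_acquire("primary", now=-last_refresh_age)
    if graceful:
        lock.release("primary")    # scenario 2: status released before exit
    # Scenario 1 (crash): nothing is released, so the lease must expire.
    t = 0.0
    while not lock.try_acquire("standby", now=t):
        t += poll                  # standby retries at a fixed interval
    return t


if __name__ == "__main__":
    print("graceful release:", failover_time(graceful=True))
    print("abrupt termination:", failover_time(graceful=False))
```

In this simplified model, failover after an abrupt termination is bounded by the remaining lease time plus the polling interval, while a graceful release lets the standby take over on its next attempt. Real clients typically rely on watches or notifications rather than polling, and measured failover times also depend on network latency and service configuration.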
