Optimization of Data Propagation Algorithm for Conflict-Free Replicated Data Type-based Datastores in Geo-Distributed Edge Environment

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Replication primarily provides data availability by keeping multiple copies on different systems, and it is exploited to make distributed systems scalable in numbers and geographical areas. Placing a replica closer to the source of a request can also significantly reduce the time required to service the request, improving applications' performance. However, modifications made to a single copy need to be propagated to all other copies to maintain the data's consistency. Over the years, numerous strategies have been proposed for handling the tradeoff between consistency and availability, the majority of which provide either strong consistency or eventual consistency. These models do not provide sufficient compatibility for developing modern applications for geo-distributed (edge) environments. Conflict-Free Replicated Data Types (CRDTs) provide a new model of consistency referred to as strong eventual consistency. In principle, CRDTs use simple mathematical properties to guarantee a conflict-free merge even when updates arrive out of order. Lasp is a coordination-free distributed programming model for building modern distributed applications using CRDTs. Lasp uses a gossip protocol for disseminating state changes to all replicas in the system. The current implementation of gossip in Lasp is agnostic to the application's behavior and so cannot use it to propagate updates efficiently to critical replicas in the system. In this thesis, we introduce an application-specific feature to optimize the dissemination of updates in Lasp. The proposed algorithm propagates updates according to the different consistency requirements of the replicas in the system. Experimental results on a topology of 100 replicas show that the update latency at critical replicas with high consistency requirements is reduced by 40–50%, and the total bandwidth consumption in the system is reduced by 4–8%, without significant repercussions for other replicas in the system.
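
To illustrate the merge guarantee mentioned in the abstract, the following is a minimal sketch of a grow-only counter, a standard textbook state-based CRDT. It is not Lasp's implementation and not the algorithm proposed in the thesis; the class name `GCounter` and the replica identifiers are hypothetical, chosen only for this example. The merge takes the per-replica maximum, which is commutative, associative, and idempotent, so replicas converge regardless of the order in which states arrive or how often they are re-delivered.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one monotonically increasing slot per replica."""

    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, replica_id: str, amount: int = 1) -> None:
        # Each replica only ever increases its own slot.
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        # Per-slot maximum: commutative, associative, and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )

    def value(self) -> int:
        # The counter's value is the sum over all replica slots.
        return sum(self.counts.values())


# Two hypothetical replicas update independently, without coordination.
replica_a = GCounter()
replica_b = GCounter()
replica_a.increment("a", 2)
replica_b.increment("b", 3)

# Merging in either order yields the same state.
merged_ab = replica_a.merge(replica_b)
merged_ba = replica_b.merge(replica_a)
```

Here `merged_ab` and `merged_ba` hold identical state, and merging a state that has already been seen changes nothing, which is the property a gossip protocol relies on when it re-sends updates.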
