Garbage Collected CRDTs on the Web : Studying the Memory Efficiency of CRDTs in a Web Context

University essay from Uppsala universitet/Institutionen för informationsteknologi

Abstract: In today's connected society, where it is common to have several connected devices per capita, it is more important than ever that the data you need is omnipresent, i.e. its available when you need it, no matter where you are. We identify one key technology and platform that could be the future—peer-to-peer communication and the Web. Unfortunately, guaranteeing consistency and availability between users in a peer-to-peer network, where network partitions are bound to happen, can be a challenging problem to solve. To solve these problems, we turned to a promising category of data types called CRDTs—Conflict Free Replicated Data Types. By following the scientific tradition of reproduction, we build upon previous research of a CRDT framework, and adjust it work in a peer-to-peer Web environment, i.e. it runs on a Web browser. CRDTs makes use of meta-data to ensure consistency, and it is imperative to remove this meta-data once it no longer has any use—if not, memory usage grows unboundedly making the CRDT impractical for real-world use. There are different garbage collection techniques that can be applied to remove this meta-data. To investigate whether the CRDT framework and the different garbage collection techniques are suitable for the Web, we try to reproduce previous findings by running our implementation through a series of benchmarks. We test whether our implementation works correctly on the Web, as well as comparing the memory efficiency between different garbage collection techniques. In doing this, we also proved the correctness of one of these techniques. The results from our experiments showed that the CRDT framework was well-adjusted to the Web environment and worked correctly. However, while we could observe similar behaviour between different garbage collection techniques as previous research, we achieved lower relative memory savings than expected. An additional insight was that for long-running systems that often reset its shared state, it might be more efficient to not apply any garbage collection technique at all. There is still much work to be done to allow for omnipresent data on the Web, but we believe that this research contains two main takeaways. The first is that the general CRDT framework is well-suited for the Web and that it in practice might be more efficient to choose different garbage collection techniques, depending on your use-case. The second take-away is that by reproducing previous research, we can still advance the current state of the field and generate novel knowledge—indeed, by combining previous ideas in a novel environment, we are now one step closer to a future with omnipresent data.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)