Robust, fault-tolerant majority based key-value data store supporting multiple data consistency

University essay from KTH/Skolan för informations- och kommunikationsteknik (ICT)

Abstract: Web 2.0 has significantly transformed the way how modern society works now-a-days. In today‘s Web, information not only flows top down from the web sites to the readers; but also flows bottom up contributed by mass user. Hugely popular Web 2.0 applications like Wikis, social applications (e.g. Facebook, MySpace), media sharing applications (e.g. YouTube, Flickr), blogging and numerous others generate lots of user generated contents and make heavy use of the underlying storage. Data storage system is the heart of these applications as all user activities are translated to read and write requests and directed to the database for further action. Hence focus is on the storage that serves data to support the applications and its reliable and efficient design is instrumental for applications to perform in line with expectations. Large scale storage systems are being used by popular social networking services like Facebook, MySpace where millions of users‘ data have been stored and fully accessed by these companies. However from users‘ point of view there has been justified concern about user data ownership and lack of control over personal data. For example, on more than one occasions Facebook have exercised its control over users‘ data without respecting users‘ rights to ownership of their own content and manipulated data for its own business interest without users‘ knowledge or consent. The thesis proposes, designs and implements a large scale, robust and fault-tolerant key-value data storage prototype that is peer-to-peer based and intends to back away from the client-server paradigm with a view to relieving the companies from data storage and management responsibilities and letting users control their own personal data. Several read and write APIs (similar to Yahoo!‘s P NUTS but different in terms of underlying design and the environment they are targeted for) with various data consistency guarantees are provided from which a wide range of web applications would be able to choose the APIs according to their data consistency, performance and availability requirements. An analytical comparison is also made against the PNUTS system that targets a more stable environment. For evaluation, simulation has been carried out to test the system availability, scalability and fault-tolerance in a dynamic environment. The results are then analyzed and conclusion is drawn that the system is scalable, available and shows acceptable performance.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)