Performance Evaluation of MMAPv1 and WiredTiger Storage Engines in MongoDB: An Experiment

University essay from Blekinge Tekniska Högskola/Institutionen för datalogi och datorsystemteknik

Abstract: Context. With the arrival of the Web 2.0 era, the volume of structured, semi-structured and unstructured data has grown enormously. Structured data can be handled efficiently by SQL databases, but NoSQL databases were introduced to handle unstructured and semi-structured data. NoSQL databases can be broadly classified into four types: key-value, column-oriented, document-oriented and graph-oriented. MongoDB is one such NoSQL database and falls under the document-oriented category. MongoDB stores its data using storage engines, and it currently offers two: MMAPv1 and WiredTiger. Objectives. This study presents a performance evaluation of the two storage engines, MMAPv1 and WiredTiger, based on metrics obtained from a literature review. The thesis aims to show which storage engine performs better under different workloads. Methods. A literature study was conducted to gather knowledge on performance evaluations of MongoDB in comparison with other SQL and NoSQL databases. The YCSB benchmarking tool was chosen to evaluate the performance of the storage engines. Penalties were then calculated to show which storage engine performs better under each workload. Results. The literature search yielded four metrics (execution time, throughput, CPU utilization and memory utilization) as those best suited to evaluating the two storage engines. The experiment produced penalties that indicate which storage engine outperforms the other and in which scenarios. Conclusions. MMAPv1 performs better when workloads are read-favorable. WiredTiger performs better when workloads are write-favorable and when they are neutral (equal amounts of reads and writes).
