Multi-threaded execution of Cypher queries

University essay from Lunds universitet/Institutionen för datavetenskap

Abstract: In this report we investigate parallel execution of queries in graph databases. We analyse different methods of parallelization, how to introduce query parallelization to a graph database, which query operations are suitable for parallelization, and whether we can improve the execution time of a single query. We do this by designing and implementing a parallel runtime for the Cypher query language in the graph database Neo4j, but many of the design ideas and operators investigated are applicable to any graph database. We focus on increasing performance for a select few operators, while remaining fully integrated with Neo4j. We draw heavily on a design called morsel-driven parallelism: we strive to split the workload into many small pieces, “morsels”, and then hand these morsels to the threads executing the query. This is in contrast to a more classical parallelization approach, which splits the workload into a few big parts of equal size. We conclude that the operators best suited for parallelization are those that can be split into several smaller parts, where each part can be computed independently. We successfully introduce parallel execution of Cypher queries to Neo4j, and by doing so we increase the performance of a single query by up to 15 times under certain conditions.
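
The morsel-driven idea described in the abstract can be illustrated with a minimal sketch. This is not the runtime implemented in the essay, and it does not use any Neo4j API; the function name `run_morsel_driven`, its parameters, and the default morsel size are all hypothetical choices made for illustration. The input is cut into many small morsels that are placed on a shared queue, and each worker thread keeps pulling the next morsel until none are left, so a thread that finishes early simply takes more work instead of idling.

```python
import queue
import threading


def run_morsel_driven(rows, operator, num_workers=4, morsel_size=1000):
    """Apply `operator` to every row, handing small morsels to worker threads.

    Hypothetical illustration of morsel-driven scheduling; the names and
    defaults are assumptions, not taken from the essay or from Neo4j.
    """
    # Split the input into many small, independent morsels.
    chunks = [rows[start:start + morsel_size]
              for start in range(0, len(rows), morsel_size)]

    # Shared work queue: each entry carries its position so that the
    # output can be reassembled in the original order.
    morsels = queue.Queue()
    for index, chunk in enumerate(chunks):
        morsels.put((index, chunk))

    results = [None] * len(chunks)

    def worker():
        # Keep pulling morsels until the queue is empty.
        while True:
            try:
                index, chunk = morsels.get_nowait()
            except queue.Empty:
                return
            # Each worker writes to its own slot, so no lock is needed here.
            results[index] = [operator(row) for row in chunk]

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Flatten the per-morsel results back into a single list.
    return [value for part in results for value in part]


if __name__ == "__main__":
    doubled = run_morsel_driven(list(range(10)), lambda x: x * 2,
                                num_workers=3, morsel_size=3)
    print(doubled)
```

The sketch only shows the scheduling pattern. In CPython, threads do not run CPU-bound Python code in parallel, so it should not be expected to reproduce the speedups reported in the abstract. The classical approach mentioned in the abstract would correspond to choosing the morsel size so that there is exactly one chunk per worker.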
