Integrating dictionaries in a column-oriented database.

University essay from Umeå universitet/Institutionen för datavetenskap

Abstract: In today's data-driven world, managing large volumes of data has become a common challenge. Data-driven businesses often need to handle and analyse such extensive datasets effectively, particularly when real-time analysis plays a crucial role in making informed decisions. Column-oriented databases have risen in popularity as a preferred storage and analytics solution. Elisa Polystar, for instance, uses ClickHouse, a column-oriented database, to provide network and service assurance solutions in its Kalix product. One of the advantages of column-oriented databases, including ClickHouse, is the availability of compression techniques. A dictionary is an in-memory key-value structure that can be stored completely or partially in RAM and can be used in queries. This thesis conducts a series of query-based experiments to evaluate the performance of Kalix when utilising dictionaries. Results show that, compared to the traditional left outer join, the dictionary version performed significantly better in five queries in terms of both query duration and memory usage. At its best, the dictionary version is 26 times faster and consumes 1526 times less memory.
