Online aggregate tables: A method for implementing big data analysis in PostgreSQL using real time pre-calculations

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Author: Fabian Bergmark; [2017]

Keywords: big data; aggregation; real-time; PostgreSQL;

Abstract: In modern user-centric applications, data gathering and analysis is often of vital importance. Current trends in data management software show that traditional relational databases fail to keep up with the growing data sets. Outsourcing data analysis often means data is locked in with a particular service, making transitions between analysis systems nearly impossible. This thesis implements and evaluates a data analysis framework implemented completely within a relational database. The framework provides a structure for implementations of online algorithms of analytical methods to store precomputed results. The result is an even resource utilization with predictable performance that does not decrease over time. The system keeps all raw data gathered to allow for future exportation. A full implementation of the framework is tested based on the current analysis requirements of the company Shortcut Labs, and performance measurements show no problem with managing data sets of over a billion data points.
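
As a rough illustration of what "online algorithms ... to store precomputed results" can mean, the sketch below keeps a running mean per key that is updated as each data point arrives, while the raw data is still retained. It is not the thesis's implementation, which lives inside PostgreSQL; the names `OnlineMean`, `record`, `query_mean`, `raw_events` and `aggregates` are hypothetical, and plain Python containers stand in for database tables.

```python
from dataclasses import dataclass


@dataclass
class OnlineMean:
    """Running count and mean, updated one data point at a time."""
    count: int = 0
    mean: float = 0.0

    def update(self, value: float) -> None:
        # Incremental mean: constant work per data point, no rescan of history.
        self.count += 1
        self.mean += (value - self.mean) / self.count


# Hypothetical stand-ins for database tables (assumption, not from the thesis).
raw_events: list[tuple[str, float]] = []   # every raw data point is kept
aggregates: dict[str, OnlineMean] = {}     # "aggregate table": key -> running result


def record(key: str, value: float) -> None:
    """Store the raw data point and update the precomputed aggregate."""
    raw_events.append((key, value))
    aggregates.setdefault(key, OnlineMean()).update(value)


def query_mean(key: str) -> float:
    """Read the precomputed result; cost does not depend on the data volume."""
    return aggregates[key].mean
```

Because each insert does a fixed amount of extra work and each query reads a stored value, the cost per operation stays flat as the data set grows, which matches the predictable performance the abstract describes.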