Scheduling workflows to optimize for execution time

University essay from Uppsala universitet/Institutionen för informatik och media

Abstract: Many functions in today’s society are immensely dependent on data. Data drives everything from business decisions to self-driving cars to intelligent home assistants like Amazon Echo and Google Home. To make good decisions based on data, of which exabytes are generated every day, somehow that data has to be processed. Data processing can be complex and time-consuming. One way of reducing the complexity is to create workflows that consist of several steps that together produce the right result. Klarna is an example of a company that relies on workflows for transforming and analyzing data. As a company whose core business involves analyzing customer data, being able to do those analyses faster will lead to direct business value in the form of more well-informed decisions. The workflows Klarna use are currently all written in a sequential form. However, workflows, where independent tasks are executed in parallel, are more performant than workflows where only one task is executed at any point in time. Due to limitations in human attention span, parallelized workflows are harder for humans to write, compared to sequential workflows. In this work, a computer application was created that automates the parallelization of a workflow to let humans write sequential workflows while still getting the performance of parallelized workflows. The application does this by taking a simple sequential workflow, identifies dependencies in the workflow and then schedules it in a way that is as parallel as possible given the identified dependencies. Such a solution has not been created before. However, experimental evaluation shows that parallelization of a sequential workflow used in daily production at Klarna can reduce execution time by up to 80%, showing that the application can bring value to Klarna and other organizations that use workflows to analyze big data.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)