Machine learning and rule induction in invoice processing : Comparing machine learning methods in their ability to assign account codes in the bookkeeping process

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Johan Bergdorf; [2018]


Abstract: In Sweden, companies with annual revenue above 3 million SEK are required by law to record invoices in their books as soon as they arrive after a purchase. One part of this bookkeeping process is choosing which accounts to credit for each received invoice. This is time-consuming, because the right account codes for each invoice depend on a number of factors. This thesis investigates how well machine learning can manage this process. Specifically, it investigates how well machine learning methods that produce unordered rule sets can classify invoice data to predict account codes. These rule induction methods are compared to two other popular and well-tested machine learning methods that do not necessarily produce rules for interpretation and knowledge discovery, as well as to two naive classifiers that serve as baselines. The results show that the naive classifiers are strong baselines, but that the machine learning methods perform better in terms of accuracy and F2 score. The results also show that the rule induction method FURIA produces significantly fewer rules than MODLEM. The non-rule-induction method Random forest tends to perform best overall on the given performance metrics.
