Comparing the Performance of Compiled vs Interpreted RegEx

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Regular expressions (RegEx) are one of the most important computer science technologies for searching through text. Because they are used in almost every area of computer science that depends on searching, it is imperative that they are efficient. RegEx are usually implemented through a process called interpretation. This thesis explores the feasibility and execution time benefits of compiling a RegEx as part of the program instead of interpreting it. For this purpose, a prototype implementation was developed in the Rust programming language. Using this prototype, execution time benchmarks were performed that compare the optimised and commonly used interpreted variant against the thesis’ unoptimised compiled version. While the results did not identify a clearly preferred method in terms of execution time, they did highlight the potential of compiling RegEx. Because some of the tests showed faster execution times for the prototype, there are strong arguments for future research in this field, where compiled RegEx could come to benefit from the optimisations already present in the interpreted norm.
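To make the distinction in the abstract concrete, the following is a minimal sketch of the two approaches in Rust. It is not the thesis prototype and not how any particular RegEx library works. The `Token` type, both function names, and the fixed pattern `ab*c` are hypothetical and chosen only for illustration. The interpreted matcher walks a pattern stored as data at run time, while the compiled matcher has the same pattern written directly into program code, roughly as generated code might look.

```rust
/// One element of a tiny, hypothetical pattern language:
/// a literal byte, or a byte repeated zero or more times.
#[derive(Clone, Copy)]
enum Token {
    Lit(u8),
    Star(u8),
}

/// "Interpreted" matching: the pattern is data that is inspected at run time.
/// Returns true only if the whole input matches the whole pattern.
fn match_interpreted(pattern: &[Token], input: &[u8]) -> bool {
    match pattern.split_first() {
        // Pattern exhausted: succeed only if the input is exhausted too.
        None => input.is_empty(),
        // Literal: the next input byte must be equal, then match the rest.
        Some((Token::Lit(c), rest)) => {
            input.first() == Some(c) && match_interpreted(rest, &input[1..])
        }
        // Star: try zero repetitions first, otherwise consume one byte and retry.
        Some((Token::Star(c), rest)) => {
            match_interpreted(rest, input)
                || (input.first() == Some(c) && match_interpreted(pattern, &input[1..]))
        }
    }
}

/// "Compiled" matching: the pattern `ab*c` is hard-coded as ordinary Rust,
/// so no pattern data is examined while matching.
fn match_compiled_ab_star_c(input: &[u8]) -> bool {
    let mut i = 0;
    // 'a'
    if input.get(i) != Some(&b'a') {
        return false;
    }
    i += 1;
    // 'b*'
    while input.get(i) == Some(&b'b') {
        i += 1;
    }
    // 'c', which must also be the last byte of the input
    input.get(i) == Some(&b'c') && i + 1 == input.len()
}
```

Calling `match_interpreted` with the tokens `[Lit(b'a'), Star(b'b'), Lit(b'c')]` and calling `match_compiled_ab_star_c` should agree on every input. The difference is that the second function leaves the Rust compiler free to optimise the matching logic itself, which is the kind of benefit the thesis sets out to measure.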
