Using Hash Trees for Database Schema Inconsistency Detection

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: In this work, two algorithms were developed to improve the performance of inconsistency detection by using Merkle trees. The first builds a hash tree from a database schema version, and the second compares two hash trees to find where changes have occurred. Performance tests compared the hash tree approach with the approach currently used by Cisco, in which all data in the schema is traversed. The results show that the hash tree algorithm performs significantly better than the complete traversal algorithm in all cases tested, except when every node in the tree has changed. The factor of improvement is directly related to the number of nodes that the hash tree algorithm has to traverse, which in turn depends on the number of changes made between versions and on where in the schema the changed nodes are located. In the real-life example scenarios used for performance testing, the hash tree algorithm on average needs to traverse only 1.5% of the nodes that the complete traversal algorithm does, and on average gives a 200-fold improvement in performance. Even in the worst real-life case tested, the hash tree algorithm performed five times better than the complete traversal algorithm.
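
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which the abstract does not give. The SchemaNode class, the function names build_hash_tree and diff_hash_trees, the choice of SHA-256, and the matching of children by name are all assumptions made for illustration. The first function computes a digest for every node bottom-up; the second descends only into subtrees whose digests differ and reports how many nodes it visited.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SchemaNode:
    """Hypothetical schema node: a name, a definition string, and children."""
    name: str
    definition: str = ""
    children: list["SchemaNode"] = field(default_factory=list)
    digest: str = ""


def build_hash_tree(node: SchemaNode) -> str:
    """Compute and store a digest for every node, bottom-up.

    A node's digest covers its own name and definition plus the digests
    of its children, so any change in a subtree changes every ancestor.
    """
    h = hashlib.sha256()
    h.update(node.name.encode())
    h.update(b"\x00")  # separator so name/definition boundaries are unambiguous
    h.update(node.definition.encode())
    # Sort children by name so sibling order does not affect the digest
    # (an assumption; an order-sensitive schema would skip the sort).
    for child in sorted(node.children, key=lambda c: c.name):
        h.update(b"\x00")
        h.update(build_hash_tree(child).encode())
    node.digest = h.hexdigest()
    return node.digest


def diff_hash_trees(old: SchemaNode, new: SchemaNode, path: str = ""):
    """Return (changed_paths, nodes_visited) for two hashed trees.

    Both trees must already have digests from build_hash_tree.
    Subtrees with equal digests are skipped without descending.
    """
    here = f"{path}/{new.name}"
    if old.digest == new.digest:
        return [], 1  # identical subtree: prune here
    changes = []
    visited = 1
    if old.definition != new.definition:
        changes.append(f"{here} (modified)")
    old_kids = {c.name: c for c in old.children}
    new_kids = {c.name: c for c in new.children}
    for name in sorted(old_kids.keys() | new_kids.keys()):
        if name not in new_kids:
            changes.append(f"{here}/{name} (removed)")
        elif name not in old_kids:
            changes.append(f"{here}/{name} (added)")
        else:
            sub, n = diff_hash_trees(old_kids[name], new_kids[name], here)
            changes += sub
            visited += n
    return changes, visited
```

In this sketch, the number of nodes visited grows with the number of changed nodes and their depth in the tree, which matches the dependence described in the abstract. When every node has changed, nothing can be pruned and the comparison visits the whole tree, as a complete traversal would.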
