Privacy-Preserving Sharing of Health Data using Hybrid Anonymisation Techniques: A Comparison
Abstract: Data anonymisation is not a trivial task because of the trade-off between anonymity and data utility. A fairly recent attempt to address this challenge is the development of hybrid anonymisation algorithms, which combine syntactic privacy models, often k-anonymity, with differential privacy. However, the complexity of evaluating anonymisation algorithms makes it difficult to draw conclusions about their performance relative to one another. To use the algorithms in practice, it is important to understand how they differ and what their strengths and weaknesses are in different settings. This project addressed this by comparing two recently proposed hybrid anonymisation algorithms, MDP and SafePub, to study their applicability to medical datasets. The algorithms were applied to several datasets, among them a medical dataset from the wild. Performance was evaluated based on the information loss and disclosure risk of the anonymised datasets. While MDP had less information loss for stronger privacy guarantees, it is less suitable for medical datasets because it anonymises datasets under the assumption that all attributes are independent. SafePub, on the other hand, keeps the attribute dependencies intact but had substantial information loss at stronger privacy levels. Therefore, which algorithm is most suitable depends on the dataset characteristics, the required privacy level and the acceptable information loss. It is also possible that neither algorithm is suitable for a specific use case. Finally, more tests are needed to draw general conclusions about the algorithms' performance on medical datasets.