Self-supervised representation learning from electrocardiogram data for medical applications

University essay from Lunds universitet/Matematik LTH

Abstract: Cardiovascular diseases are the leading cause of death worldwide, and the burden is increasing yearly. Many abnormalities in heart cycles can nevertheless be discovered and treated years before the onset of disease. In most societies, however, regular health checkups are a concept reserved for cars, not humans. In order to save lives, our healthcare systems must adopt a preventative rather than a reactive approach. To that end, there have been several attempts over the last few decades to produce automated ECG-based heartbeat classification methods. Their performance is hindered by limited access to high-quality labeled data, which restricts their usage to secondary diagnostic purposes. In this regard, a self-supervised learning framework could provide a viable solution, as it removes the dependence of deep learning on large volumes of annotated data and instead uses unlabelled samples. In this thesis, we assess self-supervised representation learning on 12-lead clinical ECG data to examine whether such methods can produce meaningful feature representations of electrocardiogram signals from unlabelled data alone. We implement the self-supervised learning methods SimCLR, BYOL, and VICReg and compare their performance to that of a supervised learning method. In doing so, we find that self-supervised learning produces meaningful representations of ECG signals. When each method's recommended implementation protocol is followed, the performance equals that of a conventional supervised model, initially suggesting that self-supervised pre-training offers no additional benefit for downstream tasks. However, when the length of the ECG signal is increased and the data augmentation strategy is adjusted, the self-supervised pre-trained models outperform their supervised counterparts in all evaluation settings. In light of our experiments, we find that a suitable augmentation protocol is crucial for high downstream classification performance.
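To make the role of augmentation concrete, the following is a minimal sketch of how a contrastive method such as SimCLR could build a training signal from unlabelled ECG recordings: two random views of each recording are created, and the NT-Xent loss rewards embeddings in which the two views of the same recording are more similar than views of different recordings. The abstract does not state which augmentations, signal lengths, or hyperparameters the thesis uses, so the random crop, amplitude scaling, and Gaussian noise below, as well as the names `augment` and `nt_xent_loss` and all default values, are illustrative assumptions rather than the thesis's protocol.

```python
import numpy as np


def augment(ecg, rng, crop_len=2000, noise_std=0.01):
    """Return one random view of an ECG array of shape (leads, samples).

    Assumed, illustrative augmentations: random temporal crop,
    global amplitude scaling, and additive Gaussian noise.
    """
    _, n_samples = ecg.shape
    start = rng.integers(0, n_samples - crop_len + 1)
    view = ecg[:, start:start + crop_len]
    scale = rng.uniform(0.9, 1.1)
    return scale * view + rng.normal(0.0, noise_std, size=view.shape)


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for two batches of embeddings, each of shape (N, D).

    Row i of z1 and row i of z2 are embeddings of two views of the same
    recording (a positive pair); all other rows act as negatives.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative

    # Numerically stable log-softmax over each row.
    sim_max = sim.max(axis=1, keepdims=True)
    shifted = sim - sim_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Index of the positive partner for each of the 2N rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    return -log_prob[np.arange(2 * n), pos].mean()
```

In a full pipeline, each view would pass through an encoder and projection head before the loss is computed; those components are omitted here. BYOL and VICReg use the same two-view setup but replace the contrastive loss with a prediction objective and a variance-invariance-covariance objective, respectively.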
