Towards the creation of a Clinical Summarizer

University essay from Linköpings universitet/Institutionen för datavetenskap

Abstract: While Electronic Medical Records provide extensive information about patients, the vast amount of data makes it difficult to quickly retrieve the information needed to make accurate assumptions and decisions that directly concern patients’ health. This search process is time-consuming and forces health professionals to focus on a labor-intensive task that diverts their attention from the main task of applying their knowledge to save lives. With the general aim of relieving professionals of the task of finding the information needed for an operational decision, this thesis explores the use of a general BERT model for extractive summarization of Swedish medical records, investigating its capability to extract sentences that convey important information to MRI physicists. To achieve this, a domain expert evaluation of medical histories was performed, creating the reference summaries that were used for model evaluation. Three implementations are included in this study. One is TextRank, a prominent unsupervised approach to extractive summarization. The other two are based on clustering and rely on BERT to encode the text. The implementations are evaluated using ROUGE metrics. The results support the use of a general BERT model for extractive summarization of medical records. Furthermore, the results are discussed in relation to the collected reference summaries, leading to a discussion of potential improvements to the domain expert evaluation, as well as possibilities for future work on the summarization of clinical documents.
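The abstract states that the implementations are evaluated with ROUGE metrics against expert-written reference summaries. As a rough illustration of what such a score measures, the following is a minimal sketch of ROUGE-N based on n-gram overlap. It is not the evaluation code used in the thesis: the function names `ngrams` and `rouge_n`, the whitespace tokenization, the lowercasing, and the example sentences are all assumptions made for this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: simple lowercased whitespace tokenization; a real
    evaluation would likely use a proper tokenizer for Swedish text.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the thesis data.
    extracted = "the patient has a pacemaker"
    expert = "patient has a pacemaker implanted"
    print(rouge_n(extracted, expert, n=1))
    print(rouge_n(extracted, expert, n=2))
```

Recall indicates how much of the expert reference the extracted summary covers, while precision indicates how much of the extracted summary also appears in the reference.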
