Representation learning for single cell morphological phenotyping

University essay from Umeå universitet/Institutionen för fysik

Author: Andreas Nenner; [2022]

Keywords: Deep learning; Image-based profiling

Abstract: Preclinical research for developing new drugs is a long and expensive process. Experiments that rely on image acquisition and analysis tend to be low-throughput and to use reporter systems that may influence the studied cells. Image-based assays that extract qualitative information from microscopic images of mammalian cells make more cost-efficient, high-throughput analyses possible. Furthermore, cell morphology has proven to be a good indicator of cell phenotype. Label-free quantification of cell apoptosis has been achieved using hand-crafted feature descriptors based on cell morphology. These descriptors translate cell characteristics into quantifiable metrics, but they risk being biased towards easily observable features and may therefore miss subtle ones.

This project proposes an alternative approach: generating a latent representation of cell features with deep learning models, and testing whether it can compete with pre-defined hand-crafted representations in classifying cells as live or dead. For this purpose, three deep learning models are implemented: one autoencoder and two variational autoencoders. We develop a core architecture, shared between the models, based on a convolutional neural network with a 16-dimensional latent space. We then train the models to recreate single-cell images of SKOV3 ovarian cancer cells. The latent representations were extracted at specific checkpoints during training and later used to train a logistic regression classifier. Finally, classification accuracy was compared between the hand-crafted and the generated representations on novel cell images. The generated representations show a slight but consistent increase in classification accuracy, of up to 4.9 percentage points, even without capturing all morphological details in the recreated images. Thus, we conclude that generated representations can outperform hand-crafted feature descriptors in live or dead cell classification.
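The classification step described in the abstract (a logistic regression classifier trained on 16-dimensional latent vectors and scored on held-out cells) can be illustrated with a minimal sketch. This is not the thesis implementation: the latent vectors here are synthetic Gaussian samples standing in for encoder outputs, and the function names, the label encoding (1 = live, 0 = dead), the learning rate and the number of epochs are all assumptions chosen for illustration.

```python
import math
import random

LATENT_DIM = 16  # latent size stated in the abstract


def make_synthetic_latents(n, rng):
    """Hypothetical stand-in for encoder outputs: two Gaussian blobs in 16-D."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)  # assumed encoding: 1 = live, 0 = dead
        shift = 1.0 if label == 1 else -1.0
        z = [rng.gauss(shift, 2.0) for _ in range(LATENT_DIM)]
        data.append((z, label))
    return data


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_proba(z, w, b):
    """Probability of the 'live' class for one latent vector."""
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)


def train_logistic_regression(data, lr=0.1, epochs=200):
    """Fit weights by full-batch gradient descent on the log loss."""
    w = [0.0] * LATENT_DIM
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * LATENT_DIM
        grad_b = 0.0
        for z, y in data:
            err = predict_proba(z, w, b) - y
            for i, zi in enumerate(z):
                grad_w[i] += err * zi
            grad_b += err
        w = [wi - lr * gi / n for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(data, w, b):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(int((predict_proba(z, w, b) >= 0.5) == (y == 1)) for z, y in data)
    return correct / len(data)


rng = random.Random(0)                     # fixed seed for reproducibility
train = make_synthetic_latents(400, rng)   # representations used for fitting
test = make_synthetic_latents(200, rng)    # "novel" cells, never seen in training
w, b = train_logistic_regression(train)
test_accuracy = accuracy(test, w, b)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

To mirror the comparison in the abstract, the same classifier would be fitted once on the hand-crafted descriptors and once on the generated representations of the same cells, and the difference between the two held-out accuracies reported in percentage points. In practice a library implementation of logistic regression would likely replace the hand-written training loop.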
