Exploring connectivity patterns in cancer proteins with machine learning

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Proteins are among the most versatile organic macromolecules; they are essential to living systems and take part in almost all biological processes. Cancer is associated with mutations that either enhance or disrupt the conformation of proteins. These mutations have been shown to accumulate in specific regions of a protein's three-dimensional structure. The aim of this thesis is to find the connections that secondary structure elements make and to explore them using a self-organizing map (SOM). The connections are detected by first mapping the three-dimensional structure onto a novel type of distance matrix that also incorporates chemical information, and then applying a density-based clustering algorithm. The connections found are mapped onto the SOM and then analyzed to see whether highly mutated connections are more common among certain SOM nodes. This was tested with an ANOVA, which indicated that there are indeed mutational asymmetries among the nodes. Further analysis of the map also showed that certain nodes were to a large extent activated by connections from genes associated with cancer.
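
The last analysis step (assigning connections to SOM nodes and testing whether mutation load differs between nodes) can be illustrated with a minimal sketch. The abstract does not say which ANOVA variant, features, or map size were used, so everything below is an assumption made for illustration: a one-way ANOVA, an already trained map given as a plain list of weight vectors, and hypothetical toy data. The function names are invented for this sketch and are not taken from the thesis.

```python
from collections import defaultdict


def best_matching_unit(vector, node_weights):
    """Return the index of the SOM node whose weight vector is closest
    to `vector` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(node_weights)),
               key=lambda i: sq_dist(vector, node_weights[i]))


def group_by_node(vectors, mutation_counts, node_weights):
    """Collect the mutation count of each connection under the SOM node
    that the connection's feature vector activates."""
    groups = defaultdict(list)
    for vec, count in zip(vectors, mutation_counts):
        groups[best_matching_unit(vec, node_weights)].append(count)
    return dict(groups)


def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers).
    Empty groups are ignored."""
    groups = [g for g in groups if g]
    k = len(groups)                    # number of non-empty groups
    n = sum(len(g) for g in groups)    # total number of observations
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy data: three SOM nodes in a 2-D feature space and
# nine connections with made-up mutation counts.
node_weights = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
vectors = [(0.1, 0.0), (0.0, 0.2), (-0.1, 0.1),
           (0.9, 1.2), (1.1, 1.0), (1.0, 0.8),
           (5.2, 4.9), (4.8, 5.1), (5.0, 5.0)]
mutation_counts = [1, 2, 3, 2, 3, 4, 5, 6, 7]

per_node = group_by_node(vectors, mutation_counts, node_weights)
f_stat = one_way_anova_f(list(per_node.values()))
print(per_node, f_stat)
```

A large F value suggests that mean mutation counts differ between nodes; turning it into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom, which a statistics library would normally supply.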
