Comparison of Support Vector Machines and Deep Learning For QSAR with Conformal Prediction

University essay from Uppsala universitet/Institutionen för farmaceutisk biovetenskap

Abstract: Quantitative Structure Activity Relationship (QSAR) is a very useful computational method which has facilitated great progress in drug development [1]. This method can be used to predict a molecule’s activity against a certain target just by comparing its structural characteristics (i.e., molecular descriptors) with those belonging to molecules of known activity. QSAR modeling is fueled by free online databases consisting of millions of active and inactive molecules and by Machine Learning (ML) methods that enable data analysis. To ensure successful implementation of ML models, there is a range of evaluation methods to estimate their performance and applicability domain. So far, a great deal of research has focused on the use of Support Vector Machines (SVMs) to classify molecules using their Molecular Signature Fingerprints as descriptors [2]. However, another Machine Learning algorithm, Deep Neural Networks (DNNs), an improvement of single-layer Neural Networks, is rising in popularity in various fields including molecule classification. The two models were compared using the CPSign software, which introduces Conformal Prediction to evaluate the reliability of model predictions based on performance for individual compounds rather than mean performance on a given test set. Three types of descriptors were used: Molecular Signature Fingerprints, Extended Connectivity Fingerprints and physicochemical descriptors. The comparison showed that the Multilayer Perceptron (MLP), which was used as a DNN representative in the current context, had performance similar to the shallower SVM models but additionally demanded longer training times [3]. It can be concluded that in the field of QSAR with the aforementioned descriptors, when the number of examples used for training is not immense, Support Vector Machines might perform equally well and demand fewer resources and less time than the more sophisticated MLPs.
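
The per-compound reliability estimate mentioned in the abstract can be illustrated with a minimal Python sketch of class-conditional inductive conformal classification. This is not the CPSign API and not the essay's code: the function names (`p_value`, `prediction_set`), the toy nonconformity scores and the significance level are all hypothetical, and the scores are assumed to come from some underlying model such as an SVM or an MLP.

```python
from typing import Dict, List, Set


def p_value(calibration_scores: List[float], test_score: float) -> float:
    """Share of calibration scores at least as nonconforming as the test score.

    The +1 terms count the test example itself, as in standard
    inductive conformal prediction.
    """
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(
    calibration_by_class: Dict[str, List[float]],
    test_scores_by_class: Dict[str, float],
    significance: float,
) -> Set[str]:
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores_by_class.items()
        if p_value(calibration_by_class[label], score) > significance
    }


# Hypothetical nonconformity scores (higher = stranger); toy data only.
calibration = {
    "active": [0.1, 0.2, 0.3, 0.4, 0.9],
    "inactive": [0.1, 0.15, 0.2, 0.3, 0.5],
}
# Scores of one test compound under each tentative label.
test_compound = {"active": 0.25, "inactive": 0.8}

labels = prediction_set(calibration, test_compound, significance=0.2)
print(labels)
```

With these toy numbers the compound conforms to the "active" calibration examples but is stranger than every "inactive" one, so only "active" remains in the prediction set. A set with both labels, or none, would signal that the model cannot make a confident single-label prediction for that compound.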
