Improving Transformer-Based Molecular Optimization Using Reinforcement Learning

University essay from Uppsala universitet/Institutionen för lingvistik och filologi

Abstract: By formulating property-based molecular optimization as a neural machine translation problem, researchers have been able to apply the Transformer model from natural language processing to generate molecules with desirable properties by making a small modification to a given starting molecule. These results verify that Transformer models can capture the connection between properties and structural changes in molecular pairs. However, current research only proposes a Transformer model with fixed parameters, which can produce only a limited number of optimized molecules. Additionally, the trained Transformer model does not always generate optimized output for every combination of molecule and desirable property constraint. To bring the Transformer model into real applications, where different sets of desirable property constraints may need to be optimized for a variety of molecules, these obstacles must first be overcome. In this work, we present a framework that uses reinforcement learning to fine-tune the pre-trained Transformer, inducing more varied output while leveraging the model's prior knowledge for a challenging data point. Our results show that, depending on the definition of the scoring function, the Transformer model can generate a much larger number of optimized molecules for a data point that is considered challenging for the pre-trained model. We also show the relation between the sampling size and the efficiency of the framework in yielding desirable outputs, to indicate the optimal configuration for future users. Furthermore, we had chemists inspect the generated molecules and found that reinforcement learning fine-tuning causes catastrophic forgetting, which leads our model to generate unstable molecules. By maintaining the prior knowledge or applying a rule-based scoring component, we demonstrate two strategies that successfully reduce the effect of catastrophic forgetting, as a reference for future research.
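
The abstract does not state the loss used for fine-tuning, so the following is only a sketch of one well-known way to "maintain the prior knowledge" during reinforcement learning: the augmented-likelihood objective from REINVENT-style molecular design. It is an assumption that a similar form applies here. The function names, the `sigma` weight, and its default value are hypothetical and chosen for illustration only.

```python
# Sketch (assumption, not the essay's confirmed method): a REINVENT-style
# objective that ties the fine-tuned agent to the pre-trained prior.
# All names and the default sigma are hypothetical.

def augmented_log_likelihood(prior_logp, score, sigma=60.0):
    """Prior log-likelihood of a sampled molecule, shifted by its score.

    prior_logp: log-likelihood of the sequence under the frozen prior model
    score:      scoring-function output, assumed to lie in [0, 1]
    sigma:      weight trading off score against staying close to the prior
    """
    return prior_logp + sigma * score


def fine_tuning_loss(agent_logp, prior_logp, score, sigma=60.0):
    """Squared gap between the augmented target and the agent's log-likelihood."""
    target = augmented_log_likelihood(prior_logp, score, sigma)
    return (target - agent_logp) ** 2


def batch_loss(samples, sigma=60.0):
    """Mean loss over (agent_logp, prior_logp, score) triples for one sampled batch."""
    losses = [fine_tuning_loss(a, p, s, sigma) for a, p, s in samples]
    return sum(losses) / len(losses)
```

With a score of zero the target reduces to the prior likelihood, so the agent is pulled back toward the pre-trained model. A rule-based component, such as a penalty for substructures that chemists flag as unstable, would enter through `score`.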
