Evaluating and optimizing Transformer models for predicting chemical reactions

University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

Abstract: In this thesis, we assess the effectiveness of a transformer model specifically trained to predict chemical reactions. The model, named Chemformer, is a sequence-to-sequence model that uses the transformer's encoder and decoder stacks. Here, we employ a pre-trained Chemformer model to predict single-step retrosynthesis and evaluate its performance across diverse chemical reaction categories using metrics such as top-k accuracy and Tanimoto similarity. We compare and analyse the results of these evaluations against those of the existing template-based model. Based on the findings of this analysis, we fine-tuned the Chemformer model for specific chemical reaction classes, such as Ugi, Suzuki-Coupling, Rearrangement, Diels-Alder and Ring-Forming. In this project, we address five research questions: whether the Chemformer model achieves higher accuracy than the template-based model, which reaction classes it performs better or worse on in terms of top-k accuracy, the level of diversity in its predictions, and which fine-tuning strategies should be employed to enhance its performance. Using attention-based explainable AI, we examine which input features influence the transformation into the predicted molecule; these results may be used in the future to design fine-tuning strategies. The evaluation of the pre-trained Chemformer model yields only moderate top-k accuracies across most reaction classes, suggesting that the model struggles to accurately predict reactions on the in-house test data. When evaluating the model's performance on USPTO data, we found similar results. While the results show that the pre-trained model outperforms the template-based model, there is still potential for improving its performance, which motivates the fine-tuning process. By fine-tuning on specific sub-tasks such as Ugi and Suzuki-Coupling, we significantly enhanced the model's performance. The fine-tuned model consistently outperforms both the pre-trained and template-based models, exhibiting a notable 50% improvement in accuracy over the pre-trained model. This substantial improvement reinforces the effectiveness of transfer learning as a powerful approach for enhancing Chemformer models.
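
For readers unfamiliar with the Tanimoto similarity metric mentioned above, the sketch below shows one common way to compute it between a predicted molecule and a reference molecule using RDKit Morgan fingerprints. The fingerprint radius and bit-vector size are illustrative assumptions, not necessarily the exact settings used in the thesis.

    from rdkit import Chem, DataStructs
    from rdkit.Chem import AllChem

    def tanimoto(smiles_a: str, smiles_b: str) -> float:
        """Tanimoto similarity between two molecules given as SMILES strings."""
        mol_a = Chem.MolFromSmiles(smiles_a)
        mol_b = Chem.MolFromSmiles(smiles_b)
        if mol_a is None or mol_b is None:
            raise ValueError("Invalid SMILES input")
        # Morgan (ECFP4-like) bit fingerprints; radius and size are illustrative choices
        fp_a = AllChem.GetMorganFingerprintAsBitVect(mol_a, radius=2, nBits=2048)
        fp_b = AllChem.GetMorganFingerprintAsBitVect(mol_b, radius=2, nBits=2048)
        return DataStructs.TanimotoSimilarity(fp_a, fp_b)

    # Example: compare a predicted reactant against a ground-truth reactant
    print(tanimoto("CCO", "CCN"))  # ethanol vs. ethylamine

A value of 1.0 indicates identical fingerprints, while values near 0 indicate structurally dissimilar molecules, which makes the metric useful for scoring retrosynthesis predictions that are close to, but not exactly, the recorded reactants.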
