Applicability of GPT models to high-performance compute languages

University essay from Uppsala universitet/Högenergifysik

Abstract: This thesis investigates the feasibility of generating code in high-performance computing languages such as C++ with neural networks. This was done by applying transfer learning to publicly available pretrained transformers, fine-tuning them on C++ code. The models chosen for transfer learning are CodeT5, an encoder-decoder model with 770 million parameters, and two decoder-only CodeGen models, one with 350 million parameters and one with one billion parameters. All models were trained on a labeled dataset in which each sample consists of a prompt in natural language and an answer in C++ code. The CodeT5 model was additionally trained on an unlabeled dataset of C++ code, since that model did not come pretrained on C++. The models were evaluated with CodeBERTScore, which measures the cosine similarity between embeddings of the model-generated code and the reference code. The CodeT5 model achieved the best score. However, inspecting the kinds of programming tasks the models solved indicates that they can only handle trivial tasks, likely because of the limited size of the training corpus and of the models themselves. Given the limited computing resources available during the thesis, training larger models on a more extensive corpus, especially of labeled data, was not feasible, although this would likely have improved performance. Additional computing resources would therefore be required to train larger models on larger datasets.
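To make the transfer-learning setup concrete, the following is a minimal sketch of fine-tuning a CodeT5-style encoder-decoder on (natural-language prompt, C++ solution) pairs with the Hugging Face Transformers library. The checkpoint name, example pair, and hyperparameters are illustrative assumptions, not the exact configuration used in the thesis.

```python
# Hypothetical sketch: one training step of supervised fine-tuning on a prompt -> C++ pair.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-large")  # ~770M parameters
model = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-large")

# A single labeled sample; a real run iterates over the full labeled dataset in batches.
prompt = "Write a C++ function that returns the sum of two integers."
solution = "int add(int a, int b) { return a + b; }"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=256)
labels = tokenizer(solution, return_tensors="pt", truncation=True, max_length=256).input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# After fine-tuning, generation is used to produce candidate C++ code for evaluation.
model.eval()
generated = model.generate(**inputs, max_length=128)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

The evaluation idea behind CodeBERTScore can be illustrated in a simplified form: embed the generated and reference C++ with a code-pretrained encoder and compare the embeddings by cosine similarity. The actual metric matches token embeddings greedily to produce precision, recall, and F1; the mean-pooled variant below, and the encoder checkpoint named in it, are simplifying assumptions for illustration only.

```python
# Simplified sketch of embedding-based similarity between generated and reference C++.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
encoder = AutoModel.from_pretrained("microsoft/codebert-base")

def embed(code: str) -> torch.Tensor:
    """Mean-pool the encoder's last hidden states into a single vector."""
    batch = tokenizer(code, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # shape (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

candidate = "int add(int a, int b) { return a + b; }"
reference = "int add(int x, int y) { return x + y; }"
score = torch.nn.functional.cosine_similarity(embed(candidate), embed(reference), dim=0)
print(f"similarity: {score.item():.3f}")
```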
