An unsupervised method for Graph Representation Learning

University essay from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Internet services such as online shopping and chat applications have grown rapidly in recent years, generating substantial amounts of data. These data are valuable for machine learning and consist of connections between entities, such as users and items. The connections carry important information for models to exploit, and the need to extract this information from graphs gives rise to Graph Representation Learning (GRL). By training on such data with GRL methods, hidden information can be uncovered and services can be improved. The earliest GRL models, such as DeepWalk and node2vec, were unsupervised and originated from the field of Natural Language Processing (NLP). These models are easy to apply, but their performance is not satisfactory. Supervised models such as Graph Neural Networks (GNNs) and Graph Convolutional Networks (GCNs) perform better, but they require considerable effort to label data and fine-tune the model, and as datasets grow larger and more complex this burden becomes heavier. A recent breakthrough in NLP may address this problem: in the paper ‘Attention Is All You Need’, the authors introduce the Transformer model, which shows excellent performance on NLP tasks. Given that NLP has much in common with GRL and that the first unsupervised GRL models all originated from NLP, it is natural to ask whether the Transformer can improve the performance of unsupervised models in GRL. Generating embeddings for the nodes of a graph is one of the central tasks of GRL, and this thesis evaluates how well the Transformer model performs at it. Three popular citation datasets (Cora, Citeseer, PubMed) are used for training, and embedding quality is measured through node classification with a linear classifier. The thesis also investigates how model hyperparameters affect embedding accuracy, with comparison experiments on the embedding dimension, the number of layers, the sample size, and other parameters. The experiments show that the Transformer model generates better embeddings than earlier methods such as DeepWalk, while requiring less fine-tuning and less training time than supervised methods. These characteristics make the Transformer a good alternative to the baseline models for embedding generation. Further improvements to the preprocessing and the loss function may yield still higher performance.
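The evaluation protocol described in the abstract, training a linear classifier on frozen node embeddings, is a standard way to measure embedding quality. As a rough illustration only (not the thesis code), a minimal sketch in Python with scikit-learn might look like the following; the random arrays stand in for learned embeddings and for Cora's 2,708 nodes and seven class labels.

```python
# Illustrative linear-probe evaluation of node embeddings (sketch, not from the thesis).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2708, 128))  # stand-in for embeddings learned on Cora
labels = rng.integers(0, 7, size=2708)     # stand-in for Cora's 7 node classes

# Freeze the embeddings and train only a linear classifier on top:
# higher test accuracy indicates more linearly separable, i.e. better, embeddings.
X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.5, random_state=0
)
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("linear-probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```

With real embeddings in place of the random arrays, the same probe can compare methods (e.g., Transformer-based embeddings against DeepWalk) under an identical train/test split.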
