A topic model-based approach for ontology extension in the computational materials science domain

University essay from Linköpings universitet/Institutionen för datavetenskap

Author: Tong Zhang; [2020]


Abstract: As society develops, the demand for advanced materials keeps growing across every sector. From the agrarian age to the information age, people have studied materials persistently, and computational materials science explores the use of computational methods in that study. As the research deepens, however, the volume of materials science research data grows ever larger, and each research institution builds its own materials information management system. The resulting diversity of data structures and storage formats makes the data structures ambiguous and data integration complex. To make data findable and reusable, scientists have borrowed the concept of ontology from philosophy to describe the context and structure of data. An ontology is largely a representation of its field, consisting of meaningful concepts and the relations between them. Few ontologies exist in the computational materials science domain; one of them is the Materials Design Ontology (MDO). This thesis mines representative concepts and relations to extend MDO. To this end, an improved Topmine framework was deployed, containing a new frequent phrase mining algorithm and an improved phrase-based Latent Dirichlet Allocation (LDA) topic model. The improved Topmine framework introduces part-of-speech tagging and defines weighted coefficients. Its time and space complexity were reduced from quadratic to linear, and the perplexity of the phrase-based LDA was reduced by 26.7%, which indicates that the results are more concentrated and accurate. In addition, a concept lattice is constructed using formal concept analysis to extend the relations of the domain ontology.
In brief, this thesis analyzed the titles and abstracts of more than 9000 publications collected from the field to extend MDO, and demonstrated the practicality of the framework by comparing its experimental results with those of existing algorithms.
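
The perplexity figure quoted in the abstract can be read with the standard definition of perplexity: the exponential of the negative mean log-likelihood per token on held-out text. The sketch below is a minimal illustration of that definition only, not the thesis implementation; the function name `perplexity` and the toy probabilities are assumptions made for the example.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean log-likelihood per token).

    token_log_probs: natural-log probabilities that a model assigns to
    each held-out token (hypothetical values in this sketch).
    """
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Toy comparison: a model spreading probability evenly over 4 choices
# versus one concentrating it on 2 choices, over 8 held-out tokens.
spread_out = [math.log(1 / 4)] * 8
concentrated = [math.log(1 / 2)] * 8

print(perplexity(spread_out))
print(perplexity(concentrated))
```

A lower value means the model assigns higher probability to the held-out tokens, which is the sense in which a reduction in perplexity is described as more concentrated results.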
