Transformer-Based Multi-scale Technical Reports Analyser for Science Projects Cost Prediction

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Intrinsic value prediction is a Natural Language Processing (NLP) problem that consists of determining a numerical value contained implicitly and non-trivially in a text. In this project, we introduce SWORDSMAN (Sentence and Word-level Oracle for Research Documents by Semantic Multi-scale ANalysis), a transformer-based deep neural network architecture that predicts the cost of research projects from their abstracts. SWORDSMAN has a hybrid two-branch structure that performs a multi-scale analysis, combining global and local perspectives to extract more relevant information from these texts. The local branch uses Convolutional Neural Networks (CNNs) to analyse abstracts at the fine-grained word level, adding nuance to the understanding of the context in which key terms occur. The global branch combines Sentence Transformers and Radial Basis Functions (RBFs) to process the abstracts at a higher level, identifying the overall context of the project while focusing on the content of the data rather than its form. Used jointly, the two branches let SWORDSMAN analyse complex data at different levels of granularity and thereby achieve better estimation accuracy.
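
The two-branch idea described in the abstract can be illustrated with a minimal sketch. It is not the SWORDSMAN implementation: the function names, the toy vectors, the single convolution kernel, the RBF centers, and the linear fusion head are all assumptions chosen for illustration, and the word and sentence embeddings are hard-coded here rather than produced by a transformer. The local branch slides a kernel over consecutive word vectors and max-pools the result, the global branch computes Gaussian RBF activations of a sentence embedding, and the two feature sets are concatenated before a linear regression head.

```python
import math

def conv1d_max_pool(word_vecs, kernel):
    """Local branch (toy): slide a kernel over consecutive word vectors,
    apply ReLU, then max-pool over all positions."""
    width = len(kernel)
    scores = []
    for start in range(len(word_vecs) - width + 1):
        window = word_vecs[start:start + width]
        s = sum(w * x
                for k_row, vec in zip(kernel, window)
                for w, x in zip(k_row, vec))
        scores.append(max(0.0, s))  # ReLU
    return max(scores) if scores else 0.0

def rbf_features(sentence_vec, centers, gamma=1.0):
    """Global branch (toy): Gaussian RBF activation of a sentence
    embedding against each center."""
    feats = []
    for center in centers:
        sq_dist = sum((a - b) ** 2 for a, b in zip(sentence_vec, center))
        feats.append(math.exp(-gamma * sq_dist))
    return feats

def predict_cost(word_vecs, sentence_vec, kernels, centers, head_weights, bias):
    """Concatenate local and global features, then apply a linear head."""
    local_feats = [conv1d_max_pool(word_vecs, k) for k in kernels]
    global_feats = rbf_features(sentence_vec, centers)
    fused = local_feats + global_feats
    return sum(w * f for w, f in zip(head_weights, fused)) + bias

# Hypothetical toy inputs: 3 words with 2-dimensional embeddings,
# and one 2-dimensional sentence embedding.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence_vec = [0.5, 0.5]
kernels = [[[1.0, 0.0], [0.0, 1.0]]]   # one kernel spanning 2 words
centers = [[0.5, 0.5], [2.0, 2.0]]     # two RBF centers
head_weights = [1.0, 2.0, 0.0]         # one weight per fused feature
bias = 0.5

estimate = predict_cost(word_vecs, sentence_vec, kernels, centers,
                        head_weights, bias)
```

In a trained model the kernels, centers, and head weights would be learned from data, and the embeddings would come from pretrained transformer encoders; the sketch only shows how features at two levels of granularity can be fused into a single numerical estimate.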
