Technical Term Extraction Using Measures of Neology
Abstract: This study aims to show that frequency of occurrence over time for technical terms differs from general language terms in the sense that technical terms are strongly biased to be recent occurrences, and that this difference can be exploited for the automatic identification and extraction of technical terms from text. To this end, we propose two features extracted from temporally labelled datasets designed to capture surface level n-gram neology. The analysis shows that these features, calculated over consecutive bigrams, are highly indicative of technical terms, which suggests that technical terms are strongly biased to be surface level neologisms. Finally, we implement a technical term extractor using the proposed features and compare its performance against a number of baselines.
AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)