Using semantic folding with TextRank for automatic summarization

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Author: Simon Karlsson; [2017]

Keywords: automatic summarization; natural language processing; TextRank; semantic folding; extractive document summarization;

Abstract: This master thesis deals with automatic summarization of text and how semantic folding can be used as a similarity measure between sentences in the TextRank algorithm. The method was implemented and compared with two common similarity measures. These two similarity measures were cosine similarity of tf-idf vectors and the number of overlapping terms in two sentences. The three methods were implemented and the linguistic features used in the construction were stop words, part-of-speech filtering and stemming. Five different part-of-speech filters were used, with different mixtures of nouns, verbs, and adjectives. The three methods were evaluated by summarizing documents from the Document Understanding Conference and comparing them to gold-standard summarization created by human judges. Comparison between the system summaries and gold-standard summaries was made with the ROUGE-1 measure. The algorithm with semantic folding performed worst of the three methods, but only 0.0096 worse in F-score than cosine similarity of tf-idf vectors that performed best. For semantic folding, the average precision was 46.2% and recall 45.7% for the best-performing part-of-speech filter.

AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)

Using semantic folding with TextRank for automatic summarization

Searchphrases right now

Popular searches

popular essays yesterday (2024-04-22)