Metadata-Aware Measures for Answer Summarization in Community Question Answering

University essay from Institutionen för informationsteknologi

Author: Mattia Tomasoni; [2011]


Abstract: This thesis report presents a framework for automatically processing information from community Question Answering (cQA) portals. Its purpose is to automatically generate a summary in response to a question posed by a human user in natural language. The goal is for that answer to be as trustworthy, complete, relevant, and succinct as possible. To this end, the author exploits the metadata intrinsically present in User Generated Content (UGC) to bias automatic multi-document summarization techniques toward higher-quality information. The originality of this work lies in its adoption of a representation of concepts alternative to n-grams, the standard choice for text summarization tasks; furthermore, it proposes two concept-scoring functions based on the notion of semantic overlap. Experimental results on data drawn from Yahoo! Answers demonstrate the effectiveness of the presented method in terms of ROUGE scores. This shows that the information contained in the best answers voted by users of cQA portals can be successfully complemented by the proposed method.
