
Improving Triangle-Graph Based Text Summarization using Hybrid Similarity Function


Affiliations
1 Faculty of Computing, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia
 

Objective: Extractive summarization selects the most relevant sentences from a source document while preserving its most important information. Graph-based techniques have become very popular for text summarization. This paper introduces a hybrid graph-based technique for single-document extractive summarization. Methods/Statistical Analysis: Prior research using the graph-based approach for extractive summarization relied on a single function to compute the summary. In contrast, we propose a hybrid similarity function (H) for this estimation. This function combines four distinct similarity measures: cosine similarity (sim1), Jaccard similarity (sim2), word alignment-based similarity (sim3) and window-based similarity (sim4). The method uses a trainable summarizer that takes several features into account, and the effect of these features on the summarization task is investigated. Findings: Combining the traditional similarity measures (cosine and Jaccard) with dynamic programming approaches (word alignment-based and window-based) to calculate the similarity between two sentences extracted more common information and helped identify the best sentences for the final summary. The proposed method was evaluated using ROUGE measures on the DUC 2002 dataset. The experimental results showed that specific combinations of features could give higher efficiency and that some features affect summary creation more than others. Applications/Improvements: The performance of the new method was tested on the DUC 2002 dataset. Its effectiveness was measured using ROUGE scores, and the results are promising when compared with some existing techniques.

Keywords

Extractive Summarization, Feature Extraction, Graph-Based Summarization, Hybrid Similarity, Sentence Similarity, Triangle Counting
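
As an illustration of the hybrid similarity idea described in the abstract, the following is a minimal sketch rather than the authors' implementation. It computes the two traditional measures (cosine and Jaccard) on whitespace tokens and combines any list of measures by a weighted mean. The abstract does not state how H combines sim1 to sim4, nor how the word alignment-based and window-based measures are defined, so the averaging rule, the tokenizer, and all function names here are assumptions; sim3 and sim4 are left to be supplied by the caller.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization (the abstract does not specify one).
    return sentence.lower().split()


def cosine_similarity(a, b):
    # Cosine of the angle between the term-frequency vectors of two token lists.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    # Size of the shared vocabulary divided by the size of the combined vocabulary.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def hybrid_similarity(a, b, measures, weights=None):
    # Hypothetical combination rule: weighted mean of the supplied measures.
    # The paper's actual H may combine sim1..sim4 differently.
    if not measures:
        raise ValueError("at least one similarity measure is required")
    if weights is None:
        weights = [1.0] * len(measures)
    total = sum(weights)
    return sum(w * m(a, b) for w, m in zip(weights, measures)) / total


if __name__ == "__main__":
    s1 = tokenize("the cat sat")
    s2 = tokenize("the cat ran")
    measures = [cosine_similarity, jaccard_similarity]
    print(hybrid_similarity(s1, s2, measures))
```

A word alignment-based or window-based measure with the same `(a, b)` signature could be appended to `measures` without changing `hybrid_similarity`; equal weights are used only because the abstract gives no weighting.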

Authors

Yazan Alaya AL-Khassawneh
Faculty of Computing, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia
Naomie Salim
Faculty of Computing, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia
Mutasem Jarrah
Faculty of Computing, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia


DOI: https://doi.org/10.17485/ijst%2F2017%2Fv10i8%2F151180