Learning to summarize multi-documents with local and global information

Van Hau Nguyen, Son T. Mai, Minh Tien Nguyen*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Estimating sentence importance plays a key role in extractive multi-document summarization. This paper introduces a feature-engineering method for estimating the importance of sentences. Unlike prior studies, which usually rely on local information inside single documents, we define a set of features that combines local indicators inside each document with global features extracted from relevant documents on the same topic. The combination enables our model to draw on information from both channels when measuring the salience of sentences. The features are used to train a learning-to-rank model for ranking sentences. The summary is then created by selecting top-ranked sentences while taking diversity into account. We conduct extensive experiments on five benchmark datasets in two languages, English and Vietnamese. Experimental results indicate that the model achieves promising results on three DUC datasets and the best results on two Vietnamese datasets.
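
The pipeline outlined in the abstract (local and global features, a ranking model, then diversity-aware selection) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three features, the fixed `weights` (a stand-in for a trained learning-to-rank model), the Jaccard redundancy filter, and all function and parameter names are hypothetical choices made for this example.

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokens(sentence):
    """Lowercased whitespace tokens with surrounding punctuation stripped."""
    stripped = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in stripped if w]

def features(sentence, position, doc_tf, topic_tf):
    """Hypothetical features: two local (position, in-document term
    frequency) and one global (term frequency over the whole topic)."""
    words = tokens(sentence)
    if not words:
        return [0.0, 0.0, 0.0]
    local_pos = 1.0 / (position + 1)
    local_tf = sum(doc_tf[w] for w in words) / (len(words) * max(doc_tf.values()))
    global_tf = sum(topic_tf[w] for w in words) / (len(words) * max(topic_tf.values()))
    return [local_pos, local_tf, global_tf]

def jaccard(a, b):
    """Word-set overlap between two sentences, in [0, 1]."""
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def summarize(documents, weights=(0.2, 0.3, 0.5), max_sentences=2, max_sim=0.5):
    """documents: list of documents, each a list of sentence strings.
    ASSUMPTION: a fixed linear scorer replaces the trained ranking model."""
    topic_tf = Counter(w for doc in documents for s in doc for w in tokens(s))
    scored = []
    for doc in documents:
        doc_tf = Counter(w for s in doc for w in tokens(s))
        for pos, sent in enumerate(doc):
            feats = features(sent, pos, doc_tf, topic_tf)
            scored.append((sum(w * x for w, x in zip(weights, feats)), sent))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Greedy selection: take top-ranked sentences, skipping any that is
    # too similar to one already chosen (a simple diversity criterion).
    summary = []
    for _, sent in scored:
        if all(jaccard(sent, chosen) <= max_sim for chosen in summary):
            summary.append(sent)
        if len(summary) == max_sentences:
            break
    return summary
```

In a real system the weighted sum would be replaced by scores from a ranking model trained on the extracted features; the greedy filter only shows one simple way to keep near-duplicate sentences from different documents out of the same summary.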

Original language: English
Pages (from-to): 275-286
Number of pages: 12
Journal: Progress in Artificial Intelligence
Volume: 12
Issue number: 3
Early online date: 19 May 2023
Publication status: Published - Sept 2023

Bibliographical note

Funding Information:
This work was supported by Ministry of Education and Training, Vietnam, under Grant MOET B2020-SKH-02.

Publisher Copyright:
© 2023, Springer-Verlag GmbH Germany, part of Springer Nature.

Keywords

  • Feature extraction
  • Learning-to-rank
  • Summarization

ASJC Scopus subject areas

  • Artificial Intelligence
