Abstract
Estimating sentence importance plays a key role in extractive multi-document summarization. This paper introduces a feature-engineering method for estimating the importance of sentences. Unlike prior studies, which usually use local information inside single documents, we define a set of features that combines local indicators inside each document with global features extracted from relevant documents on the same topic. The combination enables our model to take into account information from two channels when measuring the salience of sentences. The features are used to train a learning-to-rank model for ranking sentences. The summary is finally created by selecting top-ranked sentences while accounting for diversity. We conduct extensive experiments on five benchmark datasets in two languages, English and Vietnamese. Experimental results indicate that the model achieves promising results on three DUC datasets and the best results on two Vietnamese datasets.
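The pipeline outlined in the abstract (local plus global features, sentence ranking, diversity-aware selection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three features, the function names, the fixed weights, and the overlap threshold are all hypothetical. In particular, the fixed linear weights merely stand in for the trained learning-to-rank model the abstract describes.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased alphanumeric tokens; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_features(topic_docs):
    """topic_docs: list of documents, each a list of sentence strings."""
    # Global channel: term counts over all documents in the topic.
    topic_vec = Counter(t for doc in topic_docs for s in doc for t in tokenize(s))
    rows = []
    for doc in topic_docs:
        # Local channel: term counts over the sentence's own document.
        doc_vec = Counter(t for s in doc for t in tokenize(s))
        for pos, sent in enumerate(doc):
            vec = Counter(tokenize(sent))
            features = [
                1.0 / (pos + 1),         # local: earlier sentences score higher
                cosine(vec, doc_vec),    # local: similarity to own document
                cosine(vec, topic_vec),  # global: similarity to the whole topic
            ]
            rows.append((sent, vec, features))
    return rows


def summarize(topic_docs, weights=(0.2, 0.3, 0.5), max_sentences=2, max_overlap=0.5):
    # Hypothetical fixed weights replace a trained learning-to-rank model.
    rows = extract_features(topic_docs)
    ranked = sorted(
        rows,
        key=lambda r: sum(w * f for w, f in zip(weights, r[2])),
        reverse=True,
    )
    chosen = []
    for sent, vec, _ in ranked:
        # Diversity: skip sentences too similar to any already selected one.
        if all(cosine(vec, c_vec) <= max_overlap for _, c_vec in chosen):
            chosen.append((sent, vec))
        if len(chosen) == max_sentences:
            break
    return [s for s, _ in chosen]
```

Calling `summarize` on a list of tokenizable documents for one topic returns up to `max_sentences` sentences. Near-duplicate sentences from different documents are filtered by the overlap check, which is one simple way to account for diversity; the paper may use a different criterion.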
Original language | English |
---|---|
Pages (from-to) | 275-286 |
Number of pages | 12 |
Journal | Progress in Artificial Intelligence |
Volume | 12 |
Issue number | 3 |
Early online date | 19 May 2023 |
Publication status | Published - Sept 2023 |
Bibliographical note
Funding Information: This work was supported by the Ministry of Education and Training, Vietnam, under Grant MOET B2020-SKH-02.
Publisher Copyright:
© 2023, Springer-Verlag GmbH Germany, part of Springer Nature.
Keywords
- Feature extraction
- Learning-to-rank
- Summarization
ASJC Scopus subject areas
- Artificial Intelligence