A survey on multi-modal summarization

Anubhav Jangra, Adam Jatowt, Sriparna Saha, Mohammad Hasanuzzaman

Research output: Contribution to journal, Article, peer-review

34 Citations (Scopus)

Abstract

The new era of technology has made it convenient for people to share their opinions across an abundance of platforms. These platforms let users express themselves in multiple forms of representation, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, which makes automatic multi-modal summarization (MMS) essential. In this article, we present a comprehensive survey of existing research in MMS, covering modalities such as text, image, audio, and video. Apart from highlighting the evaluation metrics and datasets used for the MMS task, our work also discusses the current challenges and future directions in this field.

Original language: English
Article number: 296
Number of pages: 36
Journal: ACM Computing Surveys
Volume: 55
Issue number: 13s
Publication status: Published - 13 Jul 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Keywords

  • multi-modal content processing
  • neural networks
  • summarization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
