A Cross-modal Approach for Karaoke Artifacts Correction

WeiQi Yan

Research output: Contribution to journal (Article)

1 Citation (Scopus)


Karaoke singing is a popular form of entertainment in several parts of the world. Since this genre of performance attracts amateurs, the singing often has artifacts related to scale, tempo, and synchrony. We have developed an approach to correct these artifacts using cross-modal information from multimedia streams. We first perform adaptive sampling on the user's rendition and then use the original singer's rendition, together with the video caption highlighting information, to correct the pitch, tempo, and loudness. A method of analogies is employed to perform this correction. The basic idea is to manipulate the user's rendition so that it is as similar as possible to the original singing. A pre-processing step that removes noise due to feedback and huffing also helps improve the quality of the user's audio. The results described in the paper show the effectiveness of this multimedia approach.
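As a rough illustration of the idea of making the user's rendition resemble the original, the sketch below matches only loudness, frame by frame, using RMS levels. It is not the paper's method of analogies, and it does not address pitch or tempo. It assumes the two signals are already time-aligned lists of float samples at the same sample rate; the function names and the `frame_size` and `max_gain` parameters are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_loudness(user, reference, frame_size=1024, max_gain=4.0):
    """Scale each frame of `user` toward the RMS level of the same frame
    in `reference`. Illustrative assumption: both signals are aligned.
    """
    n = min(len(user), len(reference))
    out = []
    for start in range(0, n, frame_size):
        end = min(start + frame_size, n)
        user_frame = user[start:end]
        ref_frame = reference[start:end]
        user_level = rms(user_frame)
        if user_level == 0.0:
            # A silent frame stays silent; avoid dividing by zero.
            gain = 1.0
        else:
            # Cap the gain so near-silent frames are not amplified into noise.
            gain = min(rms(ref_frame) / user_level, max_gain)
        out.extend(s * gain for s in user_frame)
    return out
```

For example, `match_loudness(user, reference, frame_size=4)` returns a copy of `user`, truncated to the shorter of the two inputs, in which each 4-sample frame is scaled toward the reference level.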
Original language: English
Pages (from-to): 413-439
Number of pages: 27
Journal: Multimedia Tools and Applications
Issue number: 3
Publication status: Published - Sep 2008

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Graphics and Computer-Aided Design
  • Information Systems
  • Software
  • Electrical and Electronic Engineering
  • Theoretical Computer Science
