A Cross-modal Approach for Karaoke Artifacts Correction

WeiQi Yan

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Karaoke singing is a popular form of entertainment in several parts of the world. Since this genre of performance attracts amateurs, the singing often has artifacts related to scale, tempo, and synchrony. We have developed an approach to correct these artifacts using information from cross-modal multimedia streams. We first perform adaptive sampling on the user's rendition and then use the original singer's rendition, together with the video caption highlighting information, to correct the pitch, tempo, and loudness. A method of analogies is employed to perform this correction. The basic idea is to manipulate the user's rendition so that it is as similar as possible to the original singing. A pre-processing step that removes noise due to feedback and huffing also helps improve the quality of the user's audio. The results reported in the paper show the effectiveness of this multimedia approach.
Original language: English
Pages (from-to): 413-439
Number of pages: 27
Journal: Multimedia Tools and Applications
Volume: 39
Issue number: 3
DOIs: 10.1007/s11042-007-0174-z
Publication status: Published - Sep 2008

Fingerprint

Sampling
Feedback
Multimedia
Processing
Adaptive Sampling
Noise Removal
Synchrony
Preprocessing
Analogy
Form

Cite this

@article{17560676de95432c8c9a06db37957f62,
title = "A Cross-modal Approach for Karaoke Artifacts Correction",
abstract = "Karaoke singing is a popular form of entertainment in several parts of the world. Since this genre of performance attracts amateurs, the singing often has artifacts related to scale, tempo, and synchrony. We have developed an approach to correct these artifacts using cross-modal multimedia streams information. We first perform adaptive sampling on the user's rendition and then use the original singer's rendition as well as the video caption highlighting information in order to correct the pitch, tempo and the loudness. A method of analogies has been employed to perform this correction. The basic idea is to manipulate the user's rendition in a manner to make it as similar as possible to the original singing. A pre-processing step of noise removal due to feedback and huffing also helps improve the quality of the user's audio. The results are described in the paper which shows the effectiveness of this multimedia approach.",
author = "WeiQi Yan",
year = "2008",
month = "9",
doi = "10.1007/s11042-007-0174-z",
language = "English",
volume = "39",
pages = "413--439",
journal = "Multimedia Tools and Applications",
issn = "1380-7501",
publisher = "Springer Netherlands",
number = "3",
}

A Cross-modal Approach for Karaoke Artifacts Correction. / Yan, WeiQi.

In: Multimedia Tools and Applications, Vol. 39, No. 3, 09.2008, p. 413-439.


TY - JOUR
T1 - A Cross-modal Approach for Karaoke Artifacts Correction
AU - Yan, WeiQi
PY - 2008/9
Y1 - 2008/9
AB - Karaoke singing is a popular form of entertainment in several parts of the world. Since this genre of performance attracts amateurs, the singing often has artifacts related to scale, tempo, and synchrony. We have developed an approach to correct these artifacts using cross-modal multimedia streams information. We first perform adaptive sampling on the user's rendition and then use the original singer's rendition as well as the video caption highlighting information in order to correct the pitch, tempo and the loudness. A method of analogies has been employed to perform this correction. The basic idea is to manipulate the user's rendition in a manner to make it as similar as possible to the original singing. A pre-processing step of noise removal due to feedback and huffing also helps improve the quality of the user's audio. The results are described in the paper which shows the effectiveness of this multimedia approach.

UR - http://www.scopus.com/inward/record.url?scp=47949092014&partnerID=8YFLogxK
U2 - 10.1007/s11042-007-0174-z
DO - 10.1007/s11042-007-0174-z
M3 - Article
VL - 39
SP - 413
EP - 439
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
SN - 1380-7501
IS - 3
ER -