Full-Sentence Correlation: a Method to Handle Unpredictable Noise for Robust Speech Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

We describe the theory and implementation of full-sentence speech correlation for speech recognition, and demonstrate its superior robustness to unseen/untrained noise. For the Aurora 2 data, trained with only clean speech, the new method performs competitively against the state-of-the-art with multicondition training and adaptation, and achieves the lowest word error rate in very low SNR (-5 dB). Further experiments with highly nonstationary noise (pop song, broadcast news, etc.) show the surprising ability of the new method to handle unpredictable noise. The new method adds several novel developments to our previous research, including the modeling of the speaker characteristics along with other acoustic and semantic features of speech for separating speech from noise, and a novel Viterbi algorithm to implement full-sentence correlation for speech recognition.

Index Terms: speech recognition, noise robustness, full sentence correlation, unseen/unpredictable noise
Original language: English
Title of host publication: Interspeech 2019: Proceedings
Pages: 436-440
Number of pages: 5
Publication status: Published - Sep 2019
