Abstract
This paper presents a new approach to single-channel speech
enhancement involving both noise and channel distortion (i.e.,
convolutional noise). The approach is based on finding longest
matching segments (LMS) from a corpus of clean, wideband
speech. The approach adds three novel developments to our
previous LMS research. First, we address the problem of channel
distortion as well as additive noise. Second, we present
an improved method for modeling noise. Third, we present
an iterative algorithm for improved speech estimates. In speech
recognition experiments on the Aurora 4 database, using our
enhancement approach as a preprocessor for feature extraction
significantly improved the performance of a baseline recognition
system. In a separate comparison against conventional enhancement
algorithms on noisy speech, the LMS algorithm achieved higher
PESQ and segmental SNR scores than the other methods.
Index Terms: corpus-based speech model, longest matching
segment, speech enhancement, speech recognition
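
For readers unfamiliar with the segmental SNR metric mentioned in the abstract, the following is a minimal sketch of one common way to compute it. It is not the paper's evaluation code. The function name `segmental_snr`, the 256-sample non-overlapping frames, and the clamping range of -10 dB to 35 dB are all assumptions chosen for illustration. The abstract does not state which settings the authors used.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR (dB) of an enhanced signal against a clean reference.

    frame_len, lo_db and hi_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")

    eps = 1e-12  # guards against log(0) and division by zero
    frame_snrs = []
    # Walk over non-overlapping frames; a trailing partial frame is dropped.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, est))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or perfectly matched frames do not dominate.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))

    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [math.sin(0.1 * n) for n in range(512)]
    attenuated = [0.9 * x for x in clean]  # stand-in for an imperfect estimate
    print(round(segmental_snr(clean, attenuated), 2))
```

Averaging in the dB domain frame by frame, rather than computing one global SNR, keeps loud frames from masking errors in quiet ones, which is the usual reason for reporting the segmental variant.
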
Original language | English |
---|---|
Title of host publication | Interspeech 2014: Proceedings |
Pages | 2710-2714 |
Number of pages | 5 |
Publication status | Published - 2014 |