Exemplar-based (or Corpus-based) speech enhancement algorithms have great potential but are typically slow due to needing to search through the entire corpus. The properties of speech can be exploited to improve these algorithms. Firstly, a corpus can be clustered by a phonetic ordering into a search tree which can be used to find a best matching segment. This dramatically reduces the search space, reducing the time complexity of searching a corpus of n segments from O(n) to O(log(n)). Secondly, clustering can be used to give a lossy compression of a speech corpus by replacing original segments with codewords. These techniques are shown in comparison with sequential search and non-compressed corpora using a simple speech enhancement algorithm. A combination of these techniques for a corpus of a quarter of WSJO results in a speedup of approximately 3000×.
|Title of host publication||2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)|
|Publisher||Institute of Electrical and Electronics Engineers (IEEE)|
|Pages||5419 - 5423|
|Publication status||Published - 13 Sep 2018|
Nesbitt, D., Crookes, D., & Ji, M. (2018). Speech Segment Clustering for Real-Time Exemplar-Based Speech Enhancement. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5419 - 5423). Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/ICASSP.2018.8461689