Inter-Frame Contextual Modelling For Visual Speech Recognition

Adrian Pass, Ming Ji, Philip Hanna, Jianguo Zhang, Darryl Stewart

Research output: Contribution to conference › Paper


Abstract

In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost when a Hidden Markov Model is used alone. We apply contextual modelling to a large speaker-independent isolated digit recognition task, and compare our approach to two commonly adopted feature-based techniques for incorporating speech dynamics. Results are presented for the baseline feature-based systems and the combined modelling technique. We show that both techniques achieve similar levels of performance when used independently; however, significant improvements in performance can be achieved through a combination of the two. In particular, we report a relative Word Error Rate improvement in excess of 17% over our best baseline system.
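The abstract does not specify how the Inter-Frame Dependent model and the Hidden Markov Model are combined. The sketch below illustrates one plausible formulation under assumed details: a weighted sum of the HMM log-likelihood and an inter-frame dependence score computed over successive visual feature frames. The names (hmm_frame_loglik, diff_gaussian, weight) and the placeholder Gaussian transition model are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical per-frame log-likelihoods from a conventional HMM decoder
# (e.g. obtained via a Viterbi alignment); shape: (num_frames,)
hmm_frame_loglik = np.array([-3.2, -2.9, -3.5, -3.1, -2.8])

def inter_frame_loglik(features, transition_model):
    """Score each frame given its predecessor with an inter-frame
    dependence model. `transition_model(prev, curr)` is assumed to
    return log p(curr | prev)."""
    scores = []
    for prev, curr in zip(features[:-1], features[1:]):
        scores.append(transition_model(prev, curr))
    return np.array(scores)

def diff_gaussian(prev, curr, sigma=1.0):
    # Placeholder inter-frame model: log-density of the frame-to-frame
    # difference under an isotropic Gaussian; a real system would train
    # this dependence model on visual speech data.
    d = curr - prev
    return (-0.5 * np.sum(d * d) / sigma**2
            - 0.5 * d.size * np.log(2 * np.pi * sigma**2))

# Toy visual feature vectors, one row per frame.
features = np.random.default_rng(0).normal(size=(5, 10))

# Weighted combination of the two knowledge sources; the weight would
# be tuned on held-out data.
weight = 0.5
combined_score = (hmm_frame_loglik.sum()
                  + weight * inter_frame_loglik(features, diff_gaussian).sum())
print(combined_score)
```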
Original language: English
Pages: 1-4
Number of pages: 4
Publication status: Published - Sep 2010
Event: International Conference on Image Processing - Hong Kong, Hong Kong
Duration: 26 Sep 2010 - 29 Sep 2010

Conference

Conference: International Conference on Image Processing
Country: Hong Kong
City: Hong Kong
Period: 26/09/2010 - 29/09/2010
