Hidden Conditional Random Fields for Visual Speech Recognition

A. Pass, Jianguo Zhang, Darryl Stewart

Research output: Contribution to conference, Paper, peer-review

1 Citation (Scopus)


In this paper we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs can be easily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, because the model can draw on more context when generating state sequences. Results are presented for a speaker-dependent, isolated-digit visual speech recognition task and compared against a baseline HMM system. First, we show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Second, we compare model performance at various levels of video compression applied to the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
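The idea of letting each state see several past and future observations can be illustrated with a small sketch. This is not the authors' implementation: the function name `stack_context`, the symmetric `window` parameter, and the edge handling (repeating the first and last frame) are assumptions made only for illustration, since the abstract does not specify how the window is built.

```python
def stack_context(frames, window):
    """Concatenate each frame with `window` past and `window` future frames.

    frames: list of per-frame feature vectors (lists of floats).
    window: number of frames of context on each side (0 means no context).
    Edge frames are padded by repeating the first/last frame; this padding
    rule is an assumption, not something stated in the abstract.
    """
    if window < 0:
        raise ValueError("window must be non-negative")
    n = len(frames)
    stacked = []
    for t in range(n):
        context = []
        for offset in range(-window, window + 1):
            # Clamp the index so that edge frames reuse the nearest valid frame.
            idx = min(max(t + offset, 0), n - 1)
            context.extend(frames[idx])
        stacked.append(context)
    return stacked


# Hypothetical one-dimensional visual features for three video frames.
frames = [[1.0], [2.0], [3.0]]
wide = stack_context(frames, 1)
```

With `window = 1`, each stacked vector holds three frames' worth of features, so a model scoring a state at time t sees the observations at t-1, t, and t+1. Increasing `window` widens that context at the cost of a larger feature vector.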
Original language: English
Number of pages: 6
Publication status: Published - Sep 2009
Event: Irish Machine Vision and Image Processing conference, Ireland
Duration: 01 Sep 2009 to 01 Sep 2009



