On-line Emotion Recognition in a 3-D Activation-Valence-Time Continuum using Acoustic and Linguistic Cues

F. Eyben, M. Wollmer, A. Graves, B. Schuller, Ellen Douglas-Cowie, Roddy Cowie

Research output: Contribution to journal › Article › peer-review

81 Citations (Scopus)
85 Downloads (Pure)

Abstract

For many applications of emotion recognition, such as virtual agents, the system must select responses while the user is still speaking. This requires reliable on-line recognition of the user’s affect. However, most emotion recognition systems are based on turn-wise processing. We present a novel approach to on-line emotion recognition from speech using Long Short-Term Memory Recurrent Neural Networks. Emotion is recognised frame-wise in a two-dimensional valence-activation continuum. In contrast to current state-of-the-art approaches, recognition is performed on low-level signal frames, similar to those used for speech recognition, and no statistical functionals are applied to low-level feature contours. Framing at a higher level is therefore unnecessary, and regression outputs can be produced in real time for every low-level input frame. We also investigate the benefits of including linguistic features, obtained by a keyword spotter, at the signal-frame level.
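
The frame-wise regression described in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the class name `OnlineLSTMRegressor`, the feature dimension, the hidden size, the random untrained weights, and the tanh output scaling to [-1, 1] are all assumptions made for illustration. It only shows the general idea that a recurrent cell keeps its state between frames, so one (valence, activation) pair can be emitted per low-level input frame without waiting for the end of a turn.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class OnlineLSTMRegressor:
    """Toy single-layer LSTM emitting (valence, activation) for each frame.

    Hypothetical illustration with random, untrained weights.
    """

    def __init__(self, n_in, n_hidden, n_out=2, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # One weight matrix per gate, acting on [frame ; previous hidden state].
        self.W = {g: mat(n_hidden, n_in + n_hidden) for g in ("i", "f", "o", "g")}
        self.b = {g: [0.0] * n_hidden for g in ("i", "f", "o", "g")}
        self.W_out = mat(n_out, n_hidden)
        self.n_hidden = n_hidden
        self.reset()

    def reset(self):
        # Clear the recurrent state, e.g. at the start of a new recording.
        self.h = [0.0] * self.n_hidden
        self.c = [0.0] * self.n_hidden

    def _gate(self, name, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[name], self.b[name])]

    def step(self, frame):
        """Consume one feature frame and return [valence, activation]."""
        z = list(frame) + self.h
        i = self._gate("i", z, sigmoid)    # input gate
        f = self._gate("f", z, sigmoid)    # forget gate
        o = self._gate("o", z, sigmoid)    # output gate
        g = self._gate("g", z, math.tanh)  # candidate cell update
        self.c = [fk * ck + ik * gk
                  for fk, ck, ik, gk in zip(f, self.c, i, g)]
        self.h = [ok * math.tanh(ck) for ok, ck in zip(o, self.c)]
        # Assumed output scaling: tanh keeps both dimensions in [-1, 1].
        return [math.tanh(sum(w * hk for w, hk in zip(row, self.h)))
                for row in self.W_out]


# Stream five made-up 4-dimensional feature frames through the model.
model = OnlineLSTMRegressor(n_in=4, n_hidden=8)
frames = [[0.1 * t, 0.0, -0.1 * t, 0.5] for t in range(5)]
outputs = [model.step(frame) for frame in frames]
```

Each call to `step` returns an estimate immediately, which is the property that makes this style of model usable while the user is still speaking. In a real system the frames would be acoustic (and possibly linguistic) features and the weights would be trained on annotated data.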
Original language: English
Pages (from-to): 7-19
Number of pages: 13
Journal: Journal on Multimodal User Interfaces
Volume: 3
Issue number: 1-2
Publication status: Published - Mar 2010

Keywords

  • emotion recognition
  • databases
  • acoustic and linguistic cues

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
