Emotion Recognition on large video dataset based on Convolutional Feature Extractor and Recurrent Neural Network

Denis Rangulov, Muhammad Fahim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

For many years, emotion recognition has remained one of the most interesting and important problems in the field of human-computer interaction. In this study, we treat emotion recognition as both a classification task and a regression task by processing encoded emotions in different datasets using deep learning models. Our model combines a convolutional neural network (CNN) with a recurrent neural network (RNN) to predict dimensional emotions on video data. In the first step, the CNN extracts feature vectors from video frames. In the second step, we feed these feature vectors to the RNN, which is trained to exploit the temporal dynamics of the video. Furthermore, we analyze how each neural network contributes to the system's overall performance. The experiments are performed on publicly available datasets, including the largest modern database, Aff-Wild2, which contains over sixty hours of video data. Using confusion matrices as an illustrative example, we show that the model overfits on an unbalanced dataset. We solve this problem by downsampling: although this significantly decreases the amount of training data, balancing the dataset improves the overall performance of the model. Hence, the study qualitatively describes the ability of deep learning models to predict facial emotions when given a sufficient amount of data. Our proposed method is implemented using TensorFlow Keras. The code is publicly available in the repository: https://github.com/DenisRang/Combined-CNN-RNN-for-emotion-recognition.
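The abstract does not say how the downsampling is carried out. As a rough illustration only, the sketch below assumes plain random undersampling, where every class is cut down to the size of the rarest class. The function name `downsample_to_balance`, the emotion labels, and the frame identifiers are hypothetical and are not taken from the paper or its repository.

```python
import random
from collections import Counter, defaultdict


def downsample_to_balance(samples, labels, seed=0):
    """Randomly drop samples so each class keeps as many items as the rarest class.

    Assumption: simple random undersampling; the paper may use another scheme.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # The rarest class sets the number of samples kept per class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        # rng.sample draws `target` distinct items without replacement.
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    rng.shuffle(balanced)
    return balanced


# Hypothetical, heavily unbalanced toy data: 80 / 15 / 5 frames per class.
labels = ["neutral"] * 80 + ["happy"] * 15 + ["sad"] * 5
samples = [f"frame_{i:03d}" for i in range(len(labels))]

balanced = downsample_to_balance(samples, labels)
counts = Counter(label for _, label in balanced)
print(counts)
```

In this toy case most of the data is discarded, which mirrors the trade-off the abstract describes: fewer training samples in exchange for a balanced class distribution.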

Original language: English
Title of host publication: 4th International Conference on Image Processing, Applications and Systems (IPAS 2020): Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 14-20
Number of pages: 7
ISBN (Electronic): 9781728175744
Publication status: Published - 01 Feb 2021
Externally published: Yes
Event: 4th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2020 - Virtual, Genova, Italy
Duration: 09 Dec 2020 to 11 Dec 2020

Publication series

Name: 4th International Conference on Image Processing, Applications and Systems, IPAS 2020

Conference

Conference: 4th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2020
Country/Territory: Italy
City: Virtual, Genova
Period: 09/12/2020 to 11/12/2020

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.

Keywords

  • Aff-Wild2
  • affective behavior analysis
  • arousal
  • CNN
  • deep neural network
  • emotion
  • RNN
  • valence

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Artificial Intelligence
