Facial expression recognition is a topical task. However, very little research investigates subtle expression recognition, which is important for mental activity analysis, deception detection, etc. We address subtle expression recognition through convolutional neural networks (CNNs) by developing multi-task learning (MTL) methods to effectively leverage a side task: facial landmark detection. Existing MTL methods follow a design pattern of shared bottom CNN layers and task-specific top layers. However, the sharing architecture is usually chosen heuristically, as it is difficult to decide which layers should be shared. Our approach is composed of (1) a novel MTL framework that automatically learns which layers to share through optimisation under tensor trace norm regularisation and (2) an invariant representation learning approach that allows the CNN to leverage tasks defined on disjoint datasets without suffering from dataset distribution shift. To advance subtle expression recognition, we contribute a Large-scale Subtle Emotions and Mental States in the Wild database (LSEMSW). LSEMSW includes a variety of cognitive states as well as basic emotions. It contains 176K images, manually annotated with 13 emotions, and thus provides the first subtle expression dataset large enough for training deep CNNs. Evaluations on the LSEMSW and 300-W (landmark) databases show the effectiveness of the proposed methods. In addition, we investigate transferring knowledge learned from the LSEMSW database to traditional (non-subtle) expression recognition. We achieve very competitive performance on the Oulu-CASIA NIR&Vis and CK+ databases via transfer learning.
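The tensor trace norm idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not state which form of the regulariser is used, so the code below assumes one common variant: stack the per-task weights of a layer into a tensor with a trailing task axis and sum the matrix nuclear norms of all its unfoldings. The function names `unfold` and `tensor_trace_norm` are hypothetical, and NumPy is used purely for illustration rather than as the authors' implementation.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-`mode` unfolding: move that axis to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def tensor_trace_norm(weights):
    """Sum of nuclear norms over all unfoldings of the stacked task weights.

    `weights` is a list of same-shaped arrays, one per task (an assumption
    made for this sketch). A small value means the tasks' weights are close
    to low-rank across the task axis, i.e. largely shared.
    """
    stacked = np.stack(weights, axis=-1)  # shape: (*layer_shape, num_tasks)
    return sum(
        np.linalg.norm(unfold(stacked, mode), ord="nuc")
        for mode in range(stacked.ndim)
    )


# Toy layer weights for two tasks.
identity = np.eye(2)
swap = np.array([[0.0, 1.0], [1.0, 0.0]])

shared_penalty = tensor_trace_norm([identity, identity])  # identical weights
distinct_penalty = tensor_trace_norm([identity, swap])    # unrelated weights

print(shared_penalty, distinct_penalty)
```

In this toy case the pair of identical weight matrices receives the smaller penalty, which is the sense in which such a regulariser encourages sharing where the tasks permit it. In a training setting the penalty would be added to the task losses with a weighting coefficient, which is likewise an assumption here.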
|Field|Value|
|---|---|
|Title of host publication|European Conference on Computer Vision 2018: Proceedings|
|Number of pages|18|
|Publication status|Published - 06 Oct 2018|
|Series name|Lecture Notes in Computer Science|
Hu, G., Liu, L., Yuan, Y., Yu, Z., Hua, Y., Zhang, Z., Shen, F., Shao, L., Hospedales, T., Robertson, N., & Yang, Y. (2018). Deep Multi-Task Learning to Recognise Subtle Facial Expressions of Mental States. In European Conference on Computer Vision 2018: Proceedings (pp. 106-123). (Lecture Notes in Computer Science; Vol. 11216). https://doi.org/10.1007/978-3-030-01258-8_7