Convolutional neural networks (CNNs) have achieved great success in several face-related tasks, such as face detection, alignment and recognition. As a fundamental problem in computer vision, face tracking plays a crucial role in various applications, such as video surveillance, human emotion detection and human-computer interaction. However, few CNN-based approaches have been proposed for face (bounding box) tracking. In this article, we propose a face tracking method based on Siamese CNNs, which takes advantage of the powerful representations of hierarchical CNN features learned from massive face images. The proposed method captures discriminative face information at both local and global levels. At the local level, representations for attribute patches (i.e., eyes, nose and mouth) are learned to distinguish one face from another; these local representations are robust to pose changes and occlusions. At the global level, representations for each whole face are learned, which take into account the spatial relationships among local patches and facial characteristics, such as skin color and nevi. In addition, we build a new large-scale, challenging face tracking dataset to evaluate face tracking methods and to advance research in this field. Extensive experiments on the collected dataset demonstrate the effectiveness of our method in comparison to several state-of-the-art visual tracking methods.
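As a rough illustration of how local and global representations might be combined when comparing a tracked face with candidate boxes, consider the minimal sketch below. It is not the authors' network or scoring rule. It assumes that some Siamese CNN has already produced one feature vector for the whole face and one per attribute patch, and it assumes cosine similarity and a fixed weighted average for fusion. The names `cosine`, `face_similarity`, `best_candidate` and `local_weight` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def face_similarity(template, candidate, local_weight=0.5):
    """Fuse global and local similarity (assumed rule: weighted average).

    Each face is a dict with a "global" vector and a "local" dict mapping
    patch names (e.g. "eyes", "nose", "mouth") to vectors. A patch missing
    from the candidate, for example because it is occluded, is skipped.
    """
    global_score = cosine(template["global"], candidate["global"])
    shared = [p for p in template["local"] if p in candidate["local"]]
    if not shared:
        # No comparable patches: fall back to the whole-face score alone.
        return global_score
    local_score = sum(
        cosine(template["local"][p], candidate["local"][p]) for p in shared
    ) / len(shared)
    return local_weight * local_score + (1.0 - local_weight) * global_score


def best_candidate(template, candidates, local_weight=0.5):
    """Index of the highest-scoring candidate in a non-empty list."""
    scores = [face_similarity(template, c, local_weight) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch, skipping patches that are absent from a candidate is one simple way to mimic robustness to occlusion; a learned fusion of the two levels would replace the fixed `local_weight`.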
Bibliographical note
Funding Information:
Manuscript received July 21, 2019; revised May 30, 2020 and August 16, 2020; accepted August 30, 2020. Date of publication September 17, 2020; date of current version September 29, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 61902092 and Grant 61872112; in part by the National Key Research and Development Program of China under Grant 2018YFC0806802 and Grant 2018YFC0832105; and in part by the Fundamental Research Funds for the Central Universities under Grant HIT.NSRIF.2020005. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Hichem Sahbi. (Corresponding author: Shengping Zhang.) Yuankai Qi and Shengping Zhang are with the School of Computer Science and Technology, Harbin Institute of Technology, Weihai 264209, China (e-mail: firstname.lastname@example.org; email@example.com).
© 1992-2012 IEEE.
Copyright 2020 Elsevier B.V., All rights reserved.
Keywords
- Correlation filter
- Face bounding box tracking
- Local and global CNN representations
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design