Weakly Supervised Salient Object Detection with Spatiotemporal Cascade Neural Networks

Yi Tang, Wenbin Zou, Zhi Jin, Yuhuan Chen, Yang Hua, Xia Li

Research output: Contribution to journal › Article › peer-review



Recently, deep learning techniques have substantially boosted the performance of salient object detection in still images. However, salient object detection in videos, whether with traditional handcrafted features or deep learning features, has not been fully investigated, probably because of the lack of sufficient manually labeled video data for saliency modeling, especially for data-driven deep learning. This paper proposes a novel weakly supervised approach to salient object detection in video, which learns a robust saliency prediction model from very limited manually labeled data and a large amount of weakly labeled data that can be easily generated in a supervised approach. Furthermore, we propose a spatiotemporal cascade neural network (SCNN) architecture for saliency modeling, in which two fully convolutional networks are cascaded to evaluate visual saliency from both spatial and temporal cues, leading to the optimal video saliency prediction. The proposed approach is extensively evaluated on widely used, challenging datasets, and the experiments demonstrate that it substantially outperforms state-of-the-art salient object detection models.
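The cascade idea in the abstract (a spatial stage whose output feeds a temporal stage) can be illustrated with a toy data-flow sketch. This is not the SCNN from the paper: both stages below are hand-written stand-ins (intensity contrast and frame differencing) for the two fully convolutional networks, and every name (`spatial_stage`, `temporal_stage`, `cascade`, `normalize`) and the equal-weight fusion are assumptions made only for illustration.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of values in [0, 1]


def normalize(m: Frame) -> Frame:
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def spatial_stage(frame: Frame) -> Frame:
    """Stand-in for the first network: contrast against the mean intensity."""
    values = [v for row in frame for v in row]
    mean = sum(values) / len(values)
    return normalize([[abs(v - mean) for v in row] for row in frame])


def temporal_stage(frame: Frame, next_frame: Frame, spatial_prior: Frame) -> Frame:
    """Stand-in for the second network: frame difference fused with the prior."""
    motion = normalize(
        [[abs(b - a) for a, b in zip(r0, r1)] for r0, r1 in zip(frame, next_frame)]
    )
    # Assumed fusion rule: equal-weight average of spatial prior and motion cue.
    fused = [
        [0.5 * (s + m) for s, m in zip(rs, rm)]
        for rs, rm in zip(spatial_prior, motion)
    ]
    return normalize(fused)


def cascade(frame: Frame, next_frame: Frame) -> Frame:
    """Run the spatial stage first, then feed its output to the temporal stage."""
    prior = spatial_stage(frame)
    return temporal_stage(frame, next_frame, prior)
```

In this sketch a pixel scores highest when it both stands out in the current frame and changes between frames. In the approach described above, each stage would instead be a learned fully convolutional network.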
Original language: English
Number of pages: 12
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Early online date: 25 Jul 2018
Publication status: Early online date - 25 Jul 2018

