Abstract
Changes in camera viewpoint on full-body pedestrians in a multi-camera scenario are difficult to handle, especially when the fields of view are non-overlapping. A direct effect of this viewpoint variability is that a pair of images of the same person captured by different cameras may lie farther apart in the feature space than either does from an image of a different identity captured by the same camera. To tackle this problem, we propose to train a state-of-the-art CNN with two new loss functions that jointly increase the inter-class discriminative power of the deep features and their intra-class compactness. The first loss promotes the aggregation of feature points around the centre of the view they belong to, within the scope of their own identity. The second loss pushes apart the feature clusters that correspond simultaneously to different views and different identities. Under the supervision of the two new objectives, ResNet50 achieves state-of-the-art accuracy on the Market-1501 and CUHK03 datasets, outperforming the softmax loss.
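For illustration only, the sketch below shows one way the two objectives could be realised in PyTorch on top of a centre-loss-style formulation. The class name ViewCentreLoss, the per-(identity, view) centres, the hinge margin and all hyperparameters are assumptions made for this sketch and are not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ViewCentreLoss(nn.Module):
    """Minimal sketch of the two objectives described in the abstract.

    One learnable centre is kept per (identity, view) pair. The pull term
    draws each deep feature towards the centre of its own identity and view;
    the push term enlarges the distance between centres that differ in both
    identity and view.
    """

    def __init__(self, num_ids, num_views, feat_dim, margin=1.0):
        super().__init__()
        self.centres = nn.Parameter(torch.randn(num_ids, num_views, feat_dim))
        self.margin = margin

    def forward(self, feats, id_labels, view_labels):
        # Pull: intra-identity, intra-view compactness.
        own_centres = self.centres[id_labels, view_labels]        # (B, D)
        pull = ((feats - own_centres) ** 2).sum(dim=1).mean()

        # Push: separate centres belonging to a different identity AND a different view.
        num_ids, num_views, feat_dim = self.centres.shape
        flat = self.centres.view(-1, feat_dim)                    # (I*V, D)
        ids = torch.arange(num_ids, device=flat.device).repeat_interleave(num_views)
        views = torch.arange(num_views, device=flat.device).repeat(num_ids)
        dists = torch.cdist(flat, flat)                           # pairwise centre distances
        differ = (ids[:, None] != ids[None, :]) & (views[:, None] != views[None, :])
        push = F.relu(self.margin - dists[differ]).mean()         # hinge with margin

        return pull + push
```

In training, a term of this kind would typically be added to the standard softmax (cross-entropy) objective with a weighting coefficient, in the way centre-loss variants are usually combined with a classification loss.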
Original language | English |
---|---|
Title of host publication | Irish Machine Vision and Image Processing Conference 2017: Proceedings |
Publisher | Irish Pattern Recognition & Classification Society |
ISBN (Print) | 978-0-9934207-2-6 |
Publication status | Early online date - 01 Jul 2017 |
Event | 19th Irish Machine Vision and Image Processing Conference 2017, Maynooth University, Maynooth, Ireland. Duration: 30 Aug 2017 → 01 Sept 2017. http://imvip2017.cs.nuim.ie/ |
Conference
Conference | 19th Irish Machine Vision and Image Processing Conference 2017 |
---|---|
Abbreviated title | IMVIP 2017 |
Country/Territory | Ireland |
City | Maynooth |
Period | 30/08/2017 → 01/09/2017 |
Internet address | http://imvip2017.cs.nuim.ie/ |