Abstract
Person re-identification involves recognizing a person
across non-overlapping camera views, with different pose,
illumination, and camera characteristics. We propose to tackle
this problem by training a deep convolutional network to represent
a person’s appearance as a low-dimensional feature vector
that is invariant to common appearance variations encountered
in the re-identification problem. Specifically, a Siamese-network
architecture is used to train a feature extraction network using
pairs of similar and dissimilar images. We show that the use of a
novel multi-task learning objective is crucial for regularizing the
network parameters in order to prevent over-fitting due to the
small size of the training dataset. We complement the verification
task, which is at the heart of re-identification, by training the
network to jointly perform verification, identification, and to
recognise attributes related to the clothing and pose of the person
in each image. Additionally, we show that our proposed approach
performs well even in the challenging cross-dataset scenario,
which may better reflect real-world expected performance.
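The multi-task objective described above can be sketched as a weighted sum of three loss terms: a contrastive (verification) term over image pairs, a softmax cross-entropy (identification) term, and a binary cross-entropy (attribute) term. This is a minimal illustrative sketch only; the specific loss forms, weights (`w_ver`, `w_id`, `w_attr`), and margin value here are assumptions, not the paper's exact formulation.

```python
import math

def contrastive_loss(d, same, margin=1.0):
    # Verification term: pull similar pairs together, push
    # dissimilar pairs apart until their distance exceeds the margin.
    return d * d if same else max(0.0, margin - d) ** 2

def cross_entropy(probs, label):
    # Identification term: softmax cross-entropy over person identities.
    return -math.log(probs[label])

def binary_cross_entropy(probs, targets):
    # Attribute term: independent binary labels (e.g. clothing, pose cues).
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(probs, targets)) / len(probs)

def multitask_loss(pair_distance, same_person,
                   id_probs, id_label,
                   attr_probs, attr_targets,
                   w_ver=1.0, w_id=1.0, w_attr=1.0):
    # Weighted combination of the three tasks; in a Siamese setup the
    # shared feature-extraction network is regularized by all of them.
    return (w_ver * contrastive_loss(pair_distance, same_person)
            + w_id * cross_entropy(id_probs, id_label)
            + w_attr * binary_cross_entropy(attr_probs, attr_targets))
```

Jointly minimizing these terms forces the shared feature vector to be discriminative for identity while remaining predictive of coarse attributes, which is what regularizes the network on a small training set.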
| Original language | English |
| --- | --- |
| Pages (from-to) | 525-539 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 27 |
| Issue number | 3 |
| Early online date | 20 Oct 2016 |
| DOIs | |
| Publication status | Published - Mar 2017 |