TY - GEN
T1 - Weak-supervised visual geo-localization via attention-based knowledge distillation
AU - Xu, Yifan
AU - Shamsolmoali, Pourya
AU - Yang, Jie
PY - 2022/11/29
Y1 - 2022/11/29
N2 - Visual geo-localization aims to estimate the geographical location of a query image by identifying the best-matched reference image from a GPS-tagged database. It remains a challenging task because of image appearance changes such as lighting, scale, and pose. Current approaches do not achieve satisfactory performance in large-scale environments because they fail to learn discriminative features for image matching. To address this problem, we introduce a practical method that exploits a weakly-supervised model with selective transfer for feature distillation. We propose an image matching method that uses image sub-regions to fully exploit the potential of difficult positive images. To improve the network's generalization and performance, the model estimates image-to-region similarity labels, with no additional parameters or manual annotations, through a soft-label loss. Moreover, for optimal training we propose a novel knowledge distillation (KD) method to effectively capture and transfer knowledge from a teacher network to a student network. More specifically, our method uses an attention network to learn relative similarities within features and utilizes these similarities to enhance the distillation intensities, further exploring the potential of difficult positive images. Our model achieves strong localization performance under large appearance variations on three challenging datasets, with satisfactory efficiency. Our code is available at https://github.com/XuYifan98/WAKD.
U2 - 10.1109/ICPR56361.2022.9955641
DO - 10.1109/ICPR56361.2022.9955641
M3 - Conference contribution
SN - 9781665490634
T3 - International Conference on Pattern Recognition (ICPR): Proceedings
SP - 1815
EP - 1821
BT - 2022 26th International Conference on Pattern Recognition (ICPR): Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
ER -