Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, detection using multi-modal imagery offers many advantages for defence and security operations. For example, the infrared (IR) modality offers persistent surveillance and is essential in poor lighting conditions. It is therefore crucial to create an object detection system that can use IR imagery. Collecting and labelling large volumes of thermal imagery is expensive and time-consuming. Consequently, we propose to exploit the abundance of labelled RGB data already available to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster-RCNN on RGB imagery and test on imagery from a thermal imager. The images contain the object classes of people and land vehicles, and represent real-life urban scenes that include clutter and occlusions. We improve the baseline F1-score by up to 20% through training with an additional loss function which reduces the difference between RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible, and that there is an opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes: the use of the IR modality, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and a saving of time and money in data collection and annotation.
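The idea of an additional loss that reduces the difference between RGB and IR feature maps can be illustrated with a minimal sketch. The abstract does not state the form of the loss, so the mean squared difference used here is an assumption, as are the names `alignment_loss`, `total_loss` and `weight`, and the premise that the two feature maps have the same shape. Feature maps are represented as plain nested lists for a single channel; a real detector would use tensors from its backbone network.

```python
from typing import List

# Hypothetical representation: one feature-map channel as rows x columns.
FeatureMap = List[List[float]]


def alignment_loss(rgb_features: FeatureMap, ir_features: FeatureMap) -> float:
    """Mean squared difference between two equally shaped feature maps.

    Assumption: the abstract only says the loss "reduces the difference"
    between RGB and IR feature maps; mean squared error is one possible choice.
    """
    if len(rgb_features) != len(ir_features):
        raise ValueError("feature maps must have the same number of rows")
    total = 0.0
    count = 0
    for rgb_row, ir_row in zip(rgb_features, ir_features):
        if len(rgb_row) != len(ir_row):
            raise ValueError("feature maps must have the same number of columns")
        for rgb_value, ir_value in zip(rgb_row, ir_row):
            total += (rgb_value - ir_value) ** 2
            count += 1
    if count == 0:
        raise ValueError("feature maps must not be empty")
    return total / count


def total_loss(
    detection_loss: float,
    rgb_features: FeatureMap,
    ir_features: FeatureMap,
    weight: float = 1.0,
) -> float:
    """Combine a detection loss with the weighted alignment term.

    The detection loss would come from labelled RGB imagery; the alignment
    term needs no IR labels, only the two feature maps.
    """
    return detection_loss + weight * alignment_loss(rgb_features, ir_features)


if __name__ == "__main__":
    rgb = [[0.0, 0.0]]
    ir = [[1.0, 3.0]]
    print(alignment_loss(rgb, ir))
    print(total_loss(2.0, rgb, ir, weight=0.5))
```

In this sketch the alignment term is zero when the two feature maps are identical and grows as they diverge, so minimising the combined loss pushes the network towards features that look alike in both modalities. The `weight` parameter is a hypothetical knob for balancing the two terms.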
|Publication status||Accepted - 27 May 2019|
|Event||SPIE Security+Defence: Artificial Intelligence and Machine Learning in Defense Applications, Palais de la Musique et des Congrès, Strasbourg, France|
|Conference number||11169|
|Period||10 Sep 2019 → 12 Sep 2019|
Abbott, R., Robertson, N., Martinez del Rincon, J., & Connor, B. (Accepted/In press). Multi-modal object detection using unsupervised transfer learning and adaptation techniques. Paper presented at SPIE Security+Defence, Strasbourg, France.