Multi-modal object detection using unsupervised transfer learning and adaptation techniques

Rachael Abbott, Neil Robertson, Jesus Martinez del Rincon, Barry Connor

Research output: Contribution to conference › Paper › peer-review



Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, multi-modal imagery offers many advantages for detection in defence and security operations. For example, the infrared (IR) modality offers persistent surveillance and is essential in poor lighting conditions. It is therefore crucial to create an object detection system which can use IR imagery. Collecting and labelling large volumes of thermal imagery is expensive and time-consuming. Consequently, we propose to mobilise the abundance of labelled RGB data already available to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster-RCNN on RGB imagery and test on imagery from a thermal imager. The images contain the object classes people and land vehicles, and represent real-life urban scenes which include clutter and occlusions. We improve the baseline F1-score by up to 20% through training with an additional loss function which reduces the difference between RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible and that there is an opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes: the use of the IR modality, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and a saving of time and money in data collection and annotation.
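
The abstract does not state the exact form of the additional loss, so the following is only a minimal sketch of the general idea, under stated assumptions: the feature maps are flattened to equal-length sequences, the alignment term is a mean squared difference, and it is added to the detection loss with a scalar weight. The names `feature_alignment_loss`, `total_loss` and `weight` are hypothetical and do not come from the paper. The F1-score helper uses the standard definition in terms of true positives, false positives and false negatives.

```python
from typing import Sequence


def feature_alignment_loss(rgb_features: Sequence[float],
                           ir_features: Sequence[float]) -> float:
    """Mean squared difference between two flattened feature maps.

    Assumption: a squared-error penalty is used here for illustration only;
    the paper may use a different measure of feature-map difference.
    """
    if len(rgb_features) != len(ir_features):
        raise ValueError("feature maps must have the same number of elements")
    if len(rgb_features) == 0:
        raise ValueError("feature maps must not be empty")
    squared = ((r - i) ** 2 for r, i in zip(rgb_features, ir_features))
    return sum(squared) / len(rgb_features)


def total_loss(detection_loss: float,
               rgb_features: Sequence[float],
               ir_features: Sequence[float],
               weight: float = 1.0) -> float:
    """Detection loss plus a weighted alignment term (hypothetical weighting)."""
    return detection_loss + weight * feature_alignment_loss(rgb_features, ir_features)


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1-score: harmonic mean of precision and recall."""
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0
```

In this sketch the alignment term is zero when the RGB and IR feature maps are identical and grows as they diverge, so minimising `total_loss` would push the two modalities towards similar features while still fitting the detection objective.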
Original language: English
Publication status: Accepted - 27 May 2019
Event: SPIE Security+Defence: Artificial Intelligence and Machine Learning in Defense Applications - Palais de la Musique et des Congrès, Strasbourg, France
Duration: 10 Sept 2019 to 12 Sept 2019
Conference number: 11169


Conference: SPIE Security+Defence

