Unsupervised visual domain adaptation by self-guided learning

  • Jian Gao

Student thesis: Doctoral Thesis (Doctor of Philosophy)

Abstract

Domain Adaptation (DA) studies how to improve model performance on a target domain using additional training data from one or more source domain(s) whose distributions differ from that of the target domain.
The most commonly studied setting in recent years is standard Unsupervised Domain Adaptation (UDA), in which unlabelled target-domain data are accompanied by labelled source-domain data during learning. Specifically, we study UDA in the field of computer vision, namely Unsupervised Visual Domain Adaptation (UVDA).

This thesis pins down two essential aspects of Unsupervised Visual Domain Adaptation: 1) how to mitigate the prominent distributional gap between the source and target domains and learn a generalised, domain-invariant feature representation; 2) how to learn effectively from unlabelled target-domain data in the presence of the source domain(s). One major challenge is that the distributional gap in a UVDA problem differs from task to task, yet it is hard to quantify and interpret. A general solution that works across applications and tasks must therefore be able to identify the different levels of distributional gap that can occur and to adapt accordingly. On the other hand, the lack of supervision in the target domain places an even stronger emphasis on learning a generalised representation. We therefore propose a series of mechanisms that find the answers directly in the data to tackle these challenges, which we collectively name self-guided learning.

In the first study, we use the distributional gap as a key indicator to develop a domain-adaptive data augmentation method for image-based recognition tasks. Data augmentation is a crucial component in training generalisable deep neural network models. Existing work has investigated transforming source-domain samples into an intermediate domain that is visually closer to the target domain, using cross-domain style transfer techniques. However, the visual resemblance to the target domain often comes at the price of losing accurate, or even correct, semantic content in the images. To resolve this issue, we propose to generate samples that are both semantically accurate and visually similar to the target domain by mixing the style-transferred images with their originals. Since the quality of the style-transferred images often depends on the distributional gap (i.e., the larger the gap, the poorer the quality), we condition the mixing process on the gap: the gap is quantified with a statistical discrepancy metric and used to dynamically guide the mixing step.
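The abstract does not name the discrepancy metric or the mixing rule, so the following is only an illustrative sketch: it assumes Maximum Mean Discrepancy (MMD) with an RBF kernel as the statistical discrepancy, and a mixing weight that decays as the estimated gap grows, so that style-transferred images are trusted less when transfer quality is likely poorer. The function names (`rbf_mmd`, `gap_conditioned_mix`) are hypothetical, not from the thesis.

```python
# Illustrative sketch only; not the thesis implementation.
import torch

def rbf_mmd(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased MMD^2 estimate between two feature batches using an RBF kernel."""
    def kernel(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

def gap_conditioned_mix(original: torch.Tensor, stylized: torch.Tensor,
                        src_feats: torch.Tensor, tgt_feats: torch.Tensor) -> torch.Tensor:
    """Blend each style-transferred image with its original; the larger the
    estimated source-target gap, the smaller the weight on the stylized image
    (style transfer is assumed less reliable when the gap is large)."""
    gap = rbf_mmd(src_feats, tgt_feats)   # scalar >= 0 for a PSD kernel
    lam = torch.exp(-gap)                 # mixing weight in (0, 1]; shrinks as gap grows
    return lam * stylized + (1.0 - lam) * original
```

The exponential mapping from gap to mixing weight is one simple monotone choice; any decreasing function of the gap would realise the same idea.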

Secondly, we investigate a new formulation of the distributional gap from the perspective of information theory. The relevance of the source domain to the target domain can be interpreted as the reduction in uncertainty about one variable (the target domain distribution) given another variable (the source domain distribution); this relevance is estimated mathematically by Mutual Information (MI). To enhance adaptation across visually very different domains, we propose an object shape learning objective that is more robust to appearance variations. The MI-based measure of the distributional gap again serves as guidance, automatically balancing the network's learning between texture and shape features. The proposed method is empirically shown to overcome texture bias and to learn a more transferable model for robust UVDA.
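As a worked illustration of this idea (again a sketch under stated assumptions, not the thesis method): MI can be lower-bounded with a small MINE-style statistics network, using the bound I(S; T) >= E_joint[T] - log E_marginal[e^T], and the resulting estimate used to shift the loss toward the shape objective when the estimated source-target MI is low. The names below (`MineEstimator`, `mi_guided_loss`), the network sizes, and the exponential weighting rule are all assumptions.

```python
# Illustrative sketch only; the estimator and weighting rule are assumptions.
import math
import torch
import torch.nn as nn

class MineEstimator(nn.Module):
    """MINE-style lower bound on I(S; T): E_joint[T] - log E_marginal[e^T]."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, s: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        joint = self.net(torch.cat([s, t], dim=1)).mean()
        t_perm = t[torch.randperm(t.size(0))]   # break pairing -> product of marginals
        marg = self.net(torch.cat([s, t_perm], dim=1)).squeeze(1)
        return joint - (torch.logsumexp(marg, dim=0) - math.log(marg.numel()))

def mi_guided_loss(texture_loss: torch.Tensor, shape_loss: torch.Tensor,
                   mi_estimate: torch.Tensor) -> torch.Tensor:
    """Weight the shape objective more heavily when the domains share little
    information (low estimated MI), where shape is assumed more robust than texture."""
    w = torch.exp(-mi_estimate.clamp(min=0.0))  # low MI -> w near 1 (favour shape)
    return (1.0 - w) * texture_loss + w * shape_loss
```

In practice the MI estimate would likely be detached from the task losses so that the weighting acts as a schedule rather than a gradient path; that detail, like the rest of the sketch, is an assumption.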

We then take a further step and look into a more general and challenging UVDA problem, Unsupervised Representation Learning for Domain Adaptation (URLDA), where both the source and target domains are unlabelled. We study promising solutions to this problem that employ recent self-supervised representation learning techniques. Observing that negative transfer is prone to occur under the URLDA setting, we conduct an empirical analysis of the factors that influence it. Besides the distributional gap, we analyse the effects of different unsupervised representation learning methods, the amount of training data, and other factors on UVDA performance. These analyses provide general guidance for performing adaptation without any labels.

Date of Award: Dec 2022
Original language: English
Awarding Institution
  • Queen's University Belfast
Sponsors: Anyvision (NI) Ltd
Supervisors: Yang Hua & Hui Wang

Keywords

  • Domain adaptation
