Deep supervised fused similarity hashing for cross-modal retrieval

Wing Ng, Yongzhi Xu, Xing Tian*, Hui Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

The need for cross-modal retrieval has increased significantly with the rapid growth of multimedia information on the Internet. However, most existing cross-modal retrieval methods neglect the correlation between label similarity and intra-modality similarity when training the common semantic subspace, so the trained subspace cannot effectively preserve the semantic similarity of the original data. Therefore, a novel cross-modal hashing method, Deep Supervised Fused Similarity Hashing (DSFSH), is proposed in this paper. DSFSH consists of two main parts. First, a fused similarity method is proposed to exploit the intrinsic inter-modality correlation of the data while preserving its intra-modality relationships. Second, a novel quantization max-margin loss is proposed; minimizing this loss closes the gap between cosine similarity and Hamming similarity. Extensive experiments on three benchmark datasets show that the proposed method yields better retrieval performance than state-of-the-art methods.
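The abstract does not give the exact formulation of the quantization max-margin loss, only that it closes the gap between cosine similarity of the learned codes and Hamming similarity of their binarized versions. The sketch below is therefore an illustrative assumption, not the paper's definition: it assumes PyTorch, real-valued code vectors u and v for paired samples, sign-based binarization, and a hinge margin; the function name and margin parameter are hypothetical.

```python
import torch
import torch.nn.functional as F

def quantization_max_margin_loss(u, v, margin=0.1):
    # u, v: real-valued hash codes for paired samples from two modalities, shape (batch, K)
    k = u.size(1)
    cos_sim = F.cosine_similarity(u, v, dim=1)   # similarity of the continuous codes
    b_u, b_v = torch.sign(u), torch.sign(v)      # binarized codes in {-1, +1} (0 only at exact zeros)
    ham_sim = (b_u * b_v).sum(dim=1) / k         # inner-product form of Hamming similarity
    gap = (cos_sim - ham_sim).abs()              # quantization gap per pair
    return F.relu(gap - margin).mean()           # hinge: only gaps beyond the margin are penalized

# Example usage with random codes
u = torch.randn(8, 64, requires_grad=True)
v = torch.randn(8, 64, requires_grad=True)
loss = quantization_max_margin_loss(u, v)
loss.backward()
```

Because the sign operation has zero gradient, the binarized codes act as constants during backpropagation and the gradient flows through the continuous cosine-similarity term, pulling the continuous codes toward their binary counterparts.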
Original language: English
Journal: Multimedia Tools and Applications
Early online date: 21 Jun 2024
DOIs
Publication status: Early online date - 21 Jun 2024

