TY - JOUR
T1 - Efficient and effective multi-modal queries through heterogeneous network embedding
AU - Duong, Chi Thang
AU - Nguyen, Thanh Tam
AU - Yin, Hongzhi
AU - Weidlich, Matthias
AU - Mai, Thai Son
AU - Aberer, Karl
AU - Nguyen, Quoc Viet Hung
PY - 2022/11/1
Y1 - 2022/11/1
N2 - The heterogeneity of today’s Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries define a user’s information needs by different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer such a multi-modal query by considering it as a set of separate uni-modal queries. However, depending on the chosen operationalisation, such an approach is inefficient or ineffective. It either requires multiple passes over the data or leads to inaccuracies since the relations between data modalities are neglected in the relevance assessment. To mitigate these challenges, we present an IR system that has been designed to answer genuine multi-modal queries. It relies on a heterogeneous network embedding, so that features from diverse modalities can be incorporated when representing both a query and the data over which it shall be evaluated. By embedding a query and the data in the same vector space, the relations across modalities are made explicit and exploited for more accurate query evaluation. At the same time, multi-modal queries are answered with a single pass over the data. An experimental evaluation using diverse real-world and synthetic datasets illustrates that our approach returns twice the amount of relevant information compared to baseline techniques, while scaling to large multi-modal databases.
AB - The heterogeneity of today’s Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries define a user’s information needs by different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer such a multi-modal query by considering it as a set of separate uni-modal queries. However, depending on the chosen operationalisation, such an approach is inefficient or ineffective. It either requires multiple passes over the data or leads to inaccuracies since the relations between data modalities are neglected in the relevance assessment. To mitigate these challenges, we present an IR system that has been designed to answer genuine multi-modal queries. It relies on a heterogeneous network embedding, so that features from diverse modalities can be incorporated when representing both a query and the data over which it shall be evaluated. By embedding a query and the data in the same vector space, the relations across modalities are made explicit and exploited for more accurate query evaluation. At the same time, multi-modal queries are answered with a single pass over the data. An experimental evaluation using diverse real-world and synthetic datasets illustrates that our approach returns twice the amount of relevant information compared to baseline techniques, while scaling to large multi-modal databases.
U2 - 10.1109/TKDE.2021.3052871
DO - 10.1109/TKDE.2021.3052871
M3 - Article
SN - 1041-4347
VL - 34
SP - 5307
EP - 5320
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 11
ER -