Abstract
Distributional models provide a convenient way of modelling semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic knowledge. Moreover, embeddings are often built from a single source of information (typically text data), even though neurocognitive research suggests that semantics is deeply linked to both language and perception. In this paper, we combine multi-modal information from both text and image-based representations derived from state-of-the-art distributional models to produce sparse, interpretable vectors using Joint Non-Negative Sparse Embedding. Through in-depth analyses comparing these sparse models to human-derived behavioural and neuroimaging data, we demonstrate their ability to predict interpretable linguistic descriptions of human ground-truth semantic knowledge.
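The core technique named in the abstract, Joint Non-Negative Sparse Embedding, can be understood as a joint matrix factorisation: the text and image embedding matrices are approximated as the product of one shared non-negative, sparse code matrix with a per-modality dictionary. Below is a minimal sketch of that idea using plain projected gradient descent with an L1 penalty; the function name `jnnse`, the optimiser, and all hyper-parameters are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def jnnse(X_text, X_image, k=10, lam=0.1, iters=200, lr=0.01, seed=0):
    """Illustrative sketch of Joint Non-Negative Sparse Embedding.

    Finds a shared non-negative sparse code A and per-modality
    dictionaries D_t, D_i approximately minimising
        ||X_text - A @ D_t||^2 + ||X_image - A @ D_i||^2 + lam * ||A||_1
    subject to A >= 0, via projected gradient descent.
    """
    rng = np.random.default_rng(seed)
    n = X_text.shape[0]                       # one row per word/concept
    A = rng.random((n, k))                    # shared sparse code, >= 0
    D_t = rng.standard_normal((k, X_text.shape[1])) * 0.1
    D_i = rng.standard_normal((k, X_image.shape[1])) * 0.1
    for _ in range(iters):
        R_t = A @ D_t - X_text                # residual, text modality
        R_i = A @ D_i - X_image               # residual, image modality
        # Gradient w.r.t. A, plus the L1 subgradient (lam) on A >= 0.
        g_A = R_t @ D_t.T + R_i @ D_i.T + lam
        A = np.maximum(A - lr * g_A, 0.0)     # step, then project onto A >= 0
        D_t -= lr * (A.T @ R_t)               # unconstrained dictionary updates
        D_i -= lr * (A.T @ R_i)
    return A, D_t, D_i
```

Because both modalities share the single code matrix `A`, each row of `A` is a joint sparse representation of a concept, which is what makes its (few) active dimensions candidates for interpretation against human semantic norms.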
Original language | English |
---|---|
Title of host publication | Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2018) |
Pages | 260-270 |
Number of pages | 11 |
Publication status | Published - 31 Oct 2018 |
Event | CoNLL 2018: The SIGNLL Conference on Computational Natural Language Learning - Brussels, Belgium; Duration: 31 Oct 2018 → 01 Nov 2018; http://www.conll.org/2018 |
Conference
Conference | CoNLL 2018: The SIGNLL Conference on Computational Natural Language Learning |
---|---|
Abbreviated title | CoNLL 2018 |
Country | Belgium |
City | Brussels |
Period | 31/10/2018 → 01/11/2018 |
Internet address | http://www.conll.org/2018 |