Strudel: A corpus-based semantic model based on properties and types

M. Baroni, B. Murphy, E. Barbu, M. Poesio

Research output: Contribution to journal › Article › peer-review

73 Citations (Scopus)

Abstract

Computational models of meaning trained on naturally occurring text successfully model human performance on tasks involving simple similarity measures, but they characterize meaning in terms of undifferentiated bags of words or topical dimensions. This has led some to question their psychological plausibility (Murphy, 2002; Schunn, 1999). We present here a fully automatic method for extracting a structured and comprehensive set of concept descriptions directly from an English part-of-speech-tagged corpus. Concepts are characterized by weighted properties, enriched with concept-property types that approximate classical relations such as hypernymy and function. Our model outperforms comparable algorithms in cognitive tasks pertaining not only to concept-internal structures (discovering properties of concepts, grouping properties by property type) but also to inter-concept relations (clustering into superordinates), suggesting the empirical validity of the property-based approach.
Original language: English
Pages (from-to): 222-254
Number of pages: 33
Journal: Cognitive Science
Volume: 34
Issue number: 2
Publication status: Published, 01 Mar 2010
