Recognising an object involves rapid visual processing and activation of semantic knowledge about the object, but how visual processing activates and interacts with semantic representations remains unclear. Cognitive neuroscience research has shown that while visual processing involves posterior regions along the ventral stream, object meaning involves more anterior regions, especially perirhinal cortex. Here we investigate visuo-semantic processing by combining a deep neural network model of vision with an attractor network model of semantics, such that visual information maps onto object meanings represented as activation patterns across features. In the combined model, concept activation is driven by visual input and co-occurrence of semantic features, consistent with neurocognitive accounts. We tested the model's ability to explain fMRI data where participants named objects. Visual layers explained activation patterns in early visual cortex, whereas pattern-information in perirhinal cortex was best explained by later stages of the attractor network, when detailed semantic representations are activated. Posterior ventral temporal cortex was best explained by intermediate stages corresponding to initial semantic processing, when visual information has the greatest influence on the emerging semantic representation. These results provide proof of principle of how a mechanistic model of combined visuo-semantic processing can account for pattern-information in the ventral stream.