Image-based consensus molecular subtype (imCMS) classification of colorectal cancer using deep learning

Korsuk Sirinukunwattana, Enric Domingo, Susan D. Richman, Keara L. Redmond, Andrew Blake, Clare Verrill, Simon J. Leedham, Aikaterini Chatzipli, Claire Hardy, Celina M. Whalley, Chieh Hsi Wu, Andrew D. Beggs, Ultan McDermott, Philip D. Dunne, Angela Meade, Steven M. Walker, Graeme I. Murray, Leslie Samuel, Matthew Seymour, Ian Tomlinson, Phil Quirke, Timothy Maughan, Jens Rittscher, Viktor H. Koelzer*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Objective: Complex phenotypes captured on histological slides represent the biological processes at play in individual cancers, but the link to underlying molecular classification has not been clarified or systematised. In colorectal cancer (CRC), histological grading is a poor predictor of disease progression, and consensus molecular subtypes (CMSs) cannot be distinguished without gene expression profiling. We hypothesise that image analysis is a cost-effective tool to associate complex features of tissue organisation with molecular and outcome data and to resolve unclassifiable or heterogeneous cases. In this study, we present an image-based approach to predict CRC CMS from standard H&E sections using deep learning.

Design: Training and evaluation of a neural network were performed using a total of n=1206 tissue sections with comprehensive multi-omic data from three independent datasets (training on FOCUS trial, n=278 patients; test on rectal cancer biopsies, GRAMPIAN cohort, n=144 patients; and The Cancer Genome Atlas (TCGA), n=430 patients). Ground truth CMS calls were ascertained by matching random forest and single sample predictions from the CMS classifier.

Results: Image-based CMS (imCMS) accurately classified slides in unseen datasets from TCGA (n=431 slides, AUC=0.84) and rectal cancer biopsies (n=265 slides, AUC=0.85). imCMS spatially resolved intratumoural heterogeneity and provided secondary calls correlating with bioinformatic prediction from molecular data. imCMS classified samples previously unclassifiable by RNA expression profiling, reproduced the expected correlations with genomic and epigenetic alterations and showed prognostic associations similar to those of transcriptomic CMS.

Conclusion: This study shows that a prediction of RNA expression classifiers can be made from H&E images, opening the door to simple, cheap and reliable biological stratification within routine workflows.
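The ground-truth rule described under Design (a CMS call is accepted when the random forest and single sample predictions match) can be illustrated with a minimal Python sketch. The function name `consensus_call`, the label strings and the choice to return `None` for non-matching samples are assumptions made for illustration; the abstract does not state how mismatches were handled, and this is not the authors' code.

```python
from typing import Optional

# The four consensus molecular subtypes; the label strings are an assumption.
CMS_LABELS = ("CMS1", "CMS2", "CMS3", "CMS4")


def consensus_call(rf_call: Optional[str], ss_call: Optional[str]) -> Optional[str]:
    """Return a CMS label only when both predictions are valid and identical.

    rf_call: random forest prediction for one sample (hypothetical input)
    ss_call: single sample prediction for the same sample (hypothetical input)
    Returns None when the calls differ or are missing (assumed handling).
    """
    if rf_call in CMS_LABELS and rf_call == ss_call:
        return rf_call
    return None


# Made-up example calls, not data from the study.
example_calls = {
    "sample_a": ("CMS2", "CMS2"),  # both agree -> accepted
    "sample_b": ("CMS1", "CMS4"),  # disagree -> no ground truth call
    "sample_c": ("CMS3", None),    # one call missing -> no ground truth call
}

ground_truth = {
    sample: consensus_call(rf, ss) for sample, (rf, ss) in example_calls.items()
}
```

Under these assumptions only `sample_a` receives a ground-truth label; the other two samples are left without a call.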

Original language: English
Pages (from-to): 544-554
Number of pages: 11
Issue number: 3
Early online date: 20 Jul 2020
Publication status: Published - Mar 2021

Bibliographical note

Funding Information:
The S:CORT consortium is a Medical Research Council stratified medicine consortium jointly funded by the MRC and CRUK. This work was further supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre. Computation used the Oxford Biomedical Research Computing (BMRC) facility, a joint development between the Wellcome Centre for Human Genetics and the Big Data Institute supported by Health Data Research UK and the NIHR Oxford Biomedical Research Centre. JR is supported through the EPSRC-funded Seebibyte programme (EP/M013774/1). JR is adjunct professor of the Ludwig Oxford Branch. VHK gratefully acknowledges funding by the Swiss National Science Foundation (P2SKP3_168322/1 and P2SKP3_168322/2), and the Promedica Foundation F-87701-41-01.

Publisher Copyright:

Copyright 2021 Elsevier B.V., All rights reserved.


Keywords

  • colorectal pathology
  • computerised image analysis
  • molecular pathology

ASJC Scopus subject areas

  • Gastroenterology

