Prediction of Listeria monocytogenes clonal complexes from multilocus variable number tandem repeat analysis patterns using a machine learning approach

Nicholas Andrews, Natalia Unrath, Patrick Wall, James F Buckley, Séamus Fanning

Research output: Contribution to journal › Article › peer-review

Abstract

Multilocus variable number tandem repeat analysis (MLVA) is a molecular subtyping technique that remains useful for tracking and tracing bacterial contaminants where the resources for whole genome sequencing are not available. Unlike techniques such as multilocus sequence typing (MLST) and pulsed-field gel electrophoresis, MLVA did not emerge as a standardized subtyping method for Listeria monocytogenes, and as a result, there is no reference database of virulent or food-associated MLVA subtypes as there is for MLST-based clonal complexes (CCs). Building on earlier work that showed the close congruence of a 5-locus MLVA scheme with MLST, a predictive model was created using the XGBoost machine learning (ML) technique, which enabled the prediction of CCs from MLVA patterns with ∼85% (±4%) accuracy. In addition to validating the model on existing data, the study simulated a straightforward update protocol for cases where previously unseen subtypes arise. This article illustrates how ML techniques can be applied with elementary coding skills to add value to previous-generation molecular subtyping data in built food processing environments.
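The abstract gives no implementation details, so the following is only a minimal sketch of the data shape and of what an update step for previously unseen subtypes might look like. It deliberately uses a simple nearest-pattern majority vote as a stand-in and is not the XGBoost model from the article. The repeat counts, the CC labels attached to them, and the function names (`hamming`, `predict_cc`, `update`) are all hypothetical placeholders, not data or code from the study.

```python
from collections import Counter

# Toy training data: 5-locus MLVA patterns (repeat counts) -> clonal complex.
# All values are invented placeholders, NOT data from the study.
TRAIN = {
    (3, 12, 5, 7, 2): "CC1",
    (3, 12, 5, 8, 2): "CC1",
    (4, 9, 6, 7, 3): "CC9",
    (4, 9, 6, 7, 4): "CC9",
    (2, 15, 5, 6, 2): "CC121",
}


def hamming(a, b):
    """Number of loci at which two MLVA patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_cc(pattern, train):
    """Predict a CC by majority vote among the closest known patterns.

    Stand-in classifier (assumption): the article used XGBoost instead.
    Returns (predicted_cc, is_unseen), where is_unseen flags a pattern
    that is absent from the training data.
    """
    best = min(hamming(pattern, p) for p in train)
    votes = Counter(cc for p, cc in train.items() if hamming(pattern, p) == best)
    return votes.most_common(1)[0][0], best > 0


def update(train, pattern, confirmed_cc):
    """Return a copy of the training data with a newly confirmed pattern added."""
    updated = dict(train)
    updated[pattern] = confirmed_cc
    return updated


new_pattern = (2, 15, 5, 6, 3)
cc, unseen = predict_cc(new_pattern, TRAIN)
print(cc, unseen)
```

In this sketch, a pattern flagged as unseen would be confirmed by another method such as MLST and then passed to `update`, after which the model would be refitted on the enlarged data set.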
Original language: English
Journal: Foodborne Pathogens and Disease
Volume: 21
Issue number: 9
Early online date: 04 Jul 2024
Publication status: Early online date - 04 Jul 2024

Keywords

  • food processing
  • Listeria monocytogenes
  • machine learning
  • MLVA
  • XGBoost
