Applications of machine learning in chronic respiratory infection

  • Andrew Thomas John England

Student thesis: Doctoral Thesis, Doctor of Philosophy

Abstract

The development of powerful analytics and machine learning (ML) in the 21st century has led to significant advances across a range of fields, including biological and clinical settings. However, applications have typically been limited to narrow contexts, relying on large, high-quality, homogeneous, numeric, and highly structured databases. Furthermore, ML approaches often lack a temporal dimension and prioritise predictive accuracy over clinical interpretability.

The aims of this thesis were to: (i) investigate contrasting respiratory databases to better understand key challenges in applying ML to chronic respiratory disease (CRD) datasets, while developing a robust pipeline for modelling clinical measures; (ii) use these findings to evaluate clinical outcomes and treatment efficacy from clinical trials and Electronic Health Records (EHRs) of adults with cystic fibrosis (CF) and bronchiectasis, specifically examining the longitudinal effects of antibiotics on Pseudomonas abundance in respiratory samples and on lung function, and assessing the impact of CFTR modulators on cardiovascular risk and inflammatory markers using a large CF EHR dataset; and (iii) apply artificial intelligence (AI) to a database of surface-enhanced Raman spectroscopy (SERS) signals from cultured bacterial samples to develop a methodology for species- and strain-level bacterial identification, and to compare ML methods with respect to intra-species and intra-strain variance due to growth time.

Clinical trial data yielded limited new insights, with results generally aligning with previous analyses. However, important longitudinal associations were identified between lung function and Pseudomonas abundance. Combination CFTR modulator therapy was linked to increased cardiovascular risk, primarily from weight gain. C-reactive protein (CRP) levels decreased and stabilised with treatment. Finally, convolutional neural networks applied to SERS data achieved high accuracy in species-level identification across growth times, but accuracy was not sufficient for reliable strain-level classification of Pseudomonas aeruginosa.

Overall, while ML shows strong potential in CRD research, substantial challenges remain in applying it to longitudinal and clinical data.

Thesis is embargoed until 31 July 2030.
Date of Award: Jul 2025
Original language: English
Awarding Institution
  • Queen's University Belfast
Sponsors: Innovative Medicines Initiative & European Federation of Pharmaceutical Industries and Associations
Supervisors: Michael Tunney, Joseph Elborn & Barry Devereux

Keywords

  • Cystic Fibrosis (CF)
  • bronchiectasis
  • machine learning
  • artificial intelligence
  • data analysis
  • tobramycin
  • CFTR modulators
  • SERS
  • clinical trials
