A novel genomics and bioinformatics approach to assess immunoglobulin and T cell receptor rearrangements and somatic hypermutation in lymphoproliferative disorders

  • Neil McCafferty

Student thesis: Doctoral Thesis, Doctor of Philosophy


The adaptive immune system provides antigen-specific immune responses through B and T lymphocytes (B and T cells) with specialised cell surface receptors. Antigen receptors are generated de novo for antigen-specific immune responses by recombination from vast repertoires, with affinity maturation providing further diversification. Somatic hypermutation (SHM) is a B cell specific affinity maturation process whereby point mutations are introduced into the genomic sequence to marginally alter the conformation of immunoglobulin (Ig) receptors. SHM is part of normal B cell development and is primarily restricted to germinal centres (GC). SHM has therefore been reported in lymphoproliferative disorders (LPDs) of GC origin. Strong prognostic links and clinical indications based on SHM status have been described in chronic lymphocytic leukaemia (CLL), where patients with mutated Ig show improved prognosis and longer overall survival. Currently, standardised SHM testing is performed by PCR amplification of clonal Ig sequences followed by Sanger sequencing (SSeq). However, next generation sequencing (NGS) presents a promising alternative as it also allows high sample throughput, investigation of structural variation (SV), and mutation analysis.

Several NGS applications have been proposed for clonality and SHM status reporting, but these still rely on initial PCR amplification of clonal sequences, which limits analysis of other molecular risk factors. Targeted NGS-capture represents a viable approach to assess IG/TR rearrangements, other SV, single nucleotide variants (SNV) and indels. However, SHM status has not yet been reported using targeted NGS-capture platforms. The purpose of this thesis is to investigate the use of NGS-capture techniques and novel bioinformatics analysis to accurately report SHM status in accordance with standardised SSeq applications.

To achieve this, a novel analytical programme was developed: VCF-SoMAtic, which reports the frequency of somatic mutations in rearranged IGH genes while accounting for polymorphic variants and sequencing artefacts. Identification of clonally rearranged genes was required for this analysis, so the effect of wet lab strategies, including probe targeting and NGS read length, was investigated. NGS-capture was shown to accurately detect clonal VDJ recombination by targeting one side of the rearrangement and using shorter read lengths, down to 75 bp.
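As an illustration of the kind of calculation described above, the following is a minimal sketch of a mutation-frequency estimate over a region that discounts known polymorphisms and low-allele-fraction calls. It is not the VCF-SoMAtic implementation: the `Variant` record, the function name, the polymorphism set and the `min_allele_fraction` cut-off are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical single-nucleotide variant call within a rearranged IGH region."""
    pos: int                 # 1-based position in the analysed region
    ref: str                 # reference base
    alt: str                 # alternate base
    allele_fraction: float   # fraction of reads supporting the alternate base


def somatic_mutation_frequency(variants, region_start, region_end,
                               known_polymorphisms, min_allele_fraction=0.2):
    """Return somatic mutations per 100 bp of the region (a percentage).

    Variants are discounted if they match a known polymorphism (assumed to be
    given as (pos, ref, alt) tuples) or fall below an assumed allele-fraction
    cut-off used here as a crude proxy for sequencing artefacts.
    """
    region_length = region_end - region_start + 1
    if region_length <= 0:
        raise ValueError("region_end must not be smaller than region_start")

    somatic = [
        v for v in variants
        if region_start <= v.pos <= region_end
        and (v.pos, v.ref, v.alt) not in known_polymorphisms
        and v.allele_fraction >= min_allele_fraction
    ]
    return 100.0 * len(somatic) / region_length


# Illustrative, made-up calls over a 1,000 bp region
example_variants = [
    Variant(10, "A", "G", 0.45),     # counted
    Variant(250, "C", "T", 0.40),    # counted
    Variant(300, "G", "A", 0.50),    # known polymorphism, discounted
    Variant(400, "T", "C", 0.03),    # low allele fraction, discounted
    Variant(1500, "A", "T", 0.50),   # outside the region, ignored
]
example_frequency = somatic_mutation_frequency(
    example_variants, 1, 1000, known_polymorphisms={(300, "G", "A")}
)
```

The allele-fraction filter is only one possible way to handle artefacts; the thesis abstract does not specify how VCF-SoMAtic does this.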

NGS SHM analysis using the 1.5 kb genomic sequence from the clonally rearranged joining gene towards the enhancer region (IGHJ-E) was found to significantly correlate with SSeq IGHV%.

Novel bioinformatic analysis of IGHJ-E and introduction of a stringent 99.8% mutational threshold in tested LPD cohorts found 88.16% to 97.44% concordance (90.3% across all 175 samples) with SSeq SHM status. Poor SHM stratification of mutated CLL was observed when IGHV genes were analysed using NGS, which was due to poor alignment of captured reads. In summary, we have created and applied a novel bioinformatic application with targeted NGS to successfully assess SHM status in a range of LPD subtypes using novel analysis of IGHJ-E.
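A threshold-based status call and a concordance figure of the kind reported above could be computed along the following lines. This sketch assumes that the 99.8% threshold refers to percent identity with the reference sequence and that samples at or above it are called unmutated; that reading, the function names and the example values are assumptions for illustration, not details taken from the thesis.

```python
def shm_status(percent_identity, threshold=99.8):
    """Call SHM status from percent identity to the reference.

    Assumption: identity at or above the threshold means "unmutated".
    """
    return "unmutated" if percent_identity >= threshold else "mutated"


def concordance(ngs_calls, sanger_calls):
    """Percentage of samples where the NGS and Sanger status calls agree."""
    if len(ngs_calls) != len(sanger_calls):
        raise ValueError("call lists must have the same length")
    if not ngs_calls:
        raise ValueError("at least one sample is required")
    matches = sum(a == b for a, b in zip(ngs_calls, sanger_calls))
    return 100.0 * matches / len(ngs_calls)


# Made-up example values, not data from the thesis
example_identities = [100.0, 99.9, 99.5, 97.0]
example_sanger = ["unmutated", "unmutated", "mutated", "unmutated"]
example_ngs = [shm_status(x) for x in example_identities]
example_concordance = concordance(example_ngs, example_sanger)
```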
Date of Award: Jul 2021
Original language: English
Awarding Institution
  • Queen's University Belfast
Sponsors: Northern Ireland Department for the Economy
Supervisors: Mark Catherwood (Supervisor), Ken Mills (Supervisor) & David Gonzalez de Castro (Supervisor)


  • Somatic hypermutation
  • next generation sequencing
  • lymphoproliferative disorders
  • immunoglobulin
  • bioinformatics
  • chronic lymphocytic leukaemia
  • lymphocyte receptor
  • recombination
  • risk stratification
  • structural variation
  • germinal centre
