Quality control of next-generation sequencing data without a reference

Urmi H Trivedi, Timothée Cézard, Stephen Bridgett, Anna Montazam, Jenna Nichols, Mark Blaxter, Karim Gharbi

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

Next-generation sequencing (NGS) technologies have dramatically expanded the breadth of genomics. Genome-scale data, once restricted to a small number of biomedical model organisms, can now be generated for virtually any species at remarkable speed and low cost. Yet non-model organisms often lack a suitable reference to map sequence reads against, making alignment-based quality control (QC) of NGS data more challenging than in cases where a well-assembled genome is already available. Here we show that by generating a rapid, non-optimized draft assembly of raw reads, it is possible to obtain reliable and informative QC metrics, thus removing the need for a high-quality reference. We use benchmark datasets generated from control samples across a range of genome sizes to illustrate that QC inferences made using draft assemblies are broadly equivalent to those made using a well-established reference, and we describe QC tools routinely used in our production facility to assess the quality of NGS data from non-model organisms.

Original language: English
Pages (from-to): 111
Journal: Frontiers in Genetics
Volume: 5
DOIs: https://doi.org/10.3389/fgene.2014.00111
Publication status: Published - 06 May 2014
Externally published: Yes

Cite this

Trivedi, U. H., Cézard, T., Bridgett, S., Montazam, A., Nichols, J., Blaxter, M., & Gharbi, K. (2014). Quality control of next-generation sequencing data without a reference. Frontiers in Genetics, 5, 111. https://doi.org/10.3389/fgene.2014.00111