Deriving accurate microbiota profiles from human samples with low bacterial content through post-sequencing processing of Illumina MiSeq data

BACKGROUND: The rapid expansion of 16S rRNA gene sequencing in challenging clinical contexts has resulted in a growing body of literature of variable quality. To a large extent, this is due to a failure to address spurious signal that is characteristic of samples with low levels of bacteria and high...

Bibliographic Details
Main authors:	Jervis-Bardy, Jake; Leong, Lex E X; Marri, Shashikanth; Smith, Renee J; Choo, Jocelyn M; Smith-Vaughan, Heidi C; Nosworthy, Elizabeth; Morris, Peter S; O’Leary, Stephen; Rogers, Geraint B; Marsh, Robyn L
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2015
Subjects:	Methodology
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428251/ https://www.ncbi.nlm.nih.gov/pubmed/25969736 http://dx.doi.org/10.1186/s40168-015-0083-8

author	Jervis-Bardy, Jake Leong, Lex E X Marri, Shashikanth Smith, Renee J Choo, Jocelyn M Smith-Vaughan, Heidi C Nosworthy, Elizabeth Morris, Peter S O’Leary, Stephen Rogers, Geraint B Marsh, Robyn L
author_sort	Jervis-Bardy, Jake
collection	PubMed
description	BACKGROUND: The rapid expansion of 16S rRNA gene sequencing in challenging clinical contexts has resulted in a growing body of literature of variable quality. To a large extent, this is due to a failure to address spurious signal that is characteristic of samples with low levels of bacteria and high levels of non-bacterial DNA. We have developed a workflow based on the paired-end read Illumina MiSeq-based approach, which enables significant improvement in data quality, post-sequencing. We demonstrate the efficacy of this methodology through its application to paediatric upper-respiratory samples from several anatomical sites. RESULTS: A workflow for processing sequence data was developed based on commonly available tools. Data generated from different sample types showed a marked variation in levels of non-bacterial signal and ‘contaminant’ bacterial reads. Significant differences in the ability of reference databases to accurately assign identity to operational taxonomic units (OTU) were observed. Three OTU-picking strategies were trialled as follows: de novo, open-reference and closed-reference, with open-reference performing substantially better. Relative abundance of OTUs identified as potential reagent contamination showed a strong inverse correlation with amplicon concentration allowing their objective removal. The removal of the spurious signal showed the greatest improvement in sample types typically containing low levels of bacteria and high levels of human DNA. A substantial impact of pre-filtering data and spurious signal removal was demonstrated by principal coordinate and co-occurrence analysis. For example, analysis of taxon co-occurrence in adenoid swab and middle ear fluid samples indicated that failure to remove the spurious signal resulted in the inclusion of six out of eleven bacterial genera that accounted for 80% of similarity between the sample types. CONCLUSIONS: The application of the presented workflow to a set of challenging clinical samples demonstrates its utility in removing the spurious signal from the dataset, allowing clinical insight to be derived from what would otherwise be highly misleading output. While other approaches could potentially achieve similar improvements, the methodology employed here represents an accessible means to exclude the signal from contamination and other artefacts. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0083-8) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4428251
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4428251 2015-05-13 Microbiome Methodology BioMed Central 2015-05-05 /pmc/articles/PMC4428251/ /pubmed/25969736 http://dx.doi.org/10.1186/s40168-015-0083-8 Text en © Jervis-Bardy et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Deriving accurate microbiota profiles from human samples with low bacterial content through post-sequencing processing of Illumina MiSeq data
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428251/ https://www.ncbi.nlm.nih.gov/pubmed/25969736 http://dx.doi.org/10.1186/s40168-015-0083-8
