Species classifier choice is a key consideration when analysing low-complexity food microbiome data

BACKGROUND: The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. H...

Bibliographic Details
Main authors:	Walsh, Aaron M., Crispie, Fiona, O’Sullivan, Orla, Finnegan, Laura, Claesson, Marcus J., Cotter, Paul D.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859664/ https://www.ncbi.nlm.nih.gov/pubmed/29554948 http://dx.doi.org/10.1186/s40168-018-0437-0

_version_	1783307868216229888
author	Walsh, Aaron M. Crispie, Fiona O’Sullivan, Orla Finnegan, Laura Claesson, Marcus J. Cotter, Paul D.
author_sort	Walsh, Aaron M.
collection	PubMed
description	BACKGROUND: The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. Here, we benchmarked the performances of three high-throughput short-read sequencing platforms, the Illumina MiSeq, NextSeq 500, and Ion Proton, for shotgun metagenomics of food microbiota. Briefly, we sequenced six kefir DNA samples and a mock community DNA sample, the latter constructed by evenly mixing genomic DNA from 13 food-related bacterial species. A variety of bioinformatic tools were used to analyse the data generated, and the effects of sequencing depth on these analyses were tested by randomly subsampling reads. RESULTS: Compositional analysis results were consistent between the platforms at divergent sequencing depths. However, we observed pronounced differences in the predictions from species classification tools. Indeed, PERMANOVA indicated that there were no significant differences between the compositional results generated by the different sequencers (p = 0.693, R² = 0.011), but there was a significant difference between the results predicted by the species classifiers (p = 0.01, R² = 0.127). The relative abundances predicted by the classifiers, apart from MetaPhlAn2, were apparently biased by reference genome sizes. Additionally, we observed varying false-positive rates among the classifiers. MetaPhlAn2 had the lowest false-positive rate, whereas SLIMM had the greatest false-positive rate. Strain-level analysis results were also similar across platforms. Each platform correctly identified the strains present in the mock community, but accuracy was improved slightly with greater sequencing depth. Notably, PanPhlAn detected the dominant strains in each kefir sample above 500,000 reads per sample. Again, the outputs from functional profiling analysis using SUPER-FOCUS were generally concordant between the platforms at different sequencing depths. Finally, and as expected, metagenome assembly completeness was significantly lower on the MiSeq than on either the NextSeq (p = 0.03) or the Proton (p = 0.011), and it improved with increased sequencing depth. CONCLUSIONS: Our results demonstrate a remarkable similarity in the results generated by the three sequencing platforms at different sequencing depths, and, in fact, the choice of bioinformatics methodology had a more evident impact on results than the choice of sequencer did. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-018-0437-0) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5859664
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5859664 2018-03-22 Species classifier choice is a key consideration when analysing low-complexity food microbiome data Walsh, Aaron M. Crispie, Fiona O’Sullivan, Orla Finnegan, Laura Claesson, Marcus J. Cotter, Paul D. Microbiome Research BioMed Central 2018-03-20 /pmc/articles/PMC5859664/ /pubmed/29554948 http://dx.doi.org/10.1186/s40168-018-0437-0 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Species classifier choice is a key consideration when analysing low-complexity food microbiome data
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859664/ https://www.ncbi.nlm.nih.gov/pubmed/29554948 http://dx.doi.org/10.1186/s40168-018-0437-0
