
Rapid analysis of metagenomic data using signature-based clustering

BACKGROUND: Sequencing highly-variable 16S regions is a common and often effective approach to the study of microbial communities, and next-generation sequencing (NGS) technologies provide abundant quantities of data for analysis. However, the speed of existing analysis pipelines may limit our abili...


Bibliographic Details
Main authors: Chappell, Timothy, Geva, Shlomo, Hogan, James M., Huygens, Flavia, Rathnayake, Irani U., Rudd, Stephen, Kelly, Wayne, Perrin, Dimitri
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6302383/
https://www.ncbi.nlm.nih.gov/pubmed/30577803
http://dx.doi.org/10.1186/s12859-018-2540-4
author Chappell, Timothy
Geva, Shlomo
Hogan, James M.
Huygens, Flavia
Rathnayake, Irani U.
Rudd, Stephen
Kelly, Wayne
Perrin, Dimitri
collection PubMed
description BACKGROUND: Sequencing highly-variable 16S regions is a common and often effective approach to the study of microbial communities, and next-generation sequencing (NGS) technologies provide abundant quantities of data for analysis. However, the speed of existing analysis pipelines may limit our ability to work with these quantities of data. Furthermore, the limited coverage of existing 16S databases may hamper our ability to characterise these communities, particularly in the context of complex or poorly studied environments. RESULTS: In this article we present the SigClust algorithm, a novel clustering method involving the transformation of sequence reads into binary signatures. When compared to other published methods, SigClust yields superior cluster coherence and separation of metagenomic read data, while operating within substantially reduced timeframes. We demonstrate its utility on published Illumina datasets and on a large collection of labelled wound reads sourced from patients in a wound clinic. The temporal analysis is based on tracking the dominant clusters of wound samples over time. The analysis can identify markers of both healing and non-healing wounds in response to treatment. Prominent clusters are found, corresponding to bacterial species known to be associated with unfavourable healing outcomes, including a number of strains of Staphylococcus aureus. CONCLUSIONS: SigClust identifies clusters rapidly and supports an improved understanding of the wound microbiome without reliance on a reference database. The results indicate a promising use for a SigClust-based pipeline in wound analysis and prediction, and a possible novel method for wound management and treatment.
format Online
Article
Text
id pubmed-6302383
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6302383 2018-12-31 Rapid analysis of metagenomic data using signature-based clustering Chappell, Timothy Geva, Shlomo Hogan, James M. Huygens, Flavia Rathnayake, Irani U. Rudd, Stephen Kelly, Wayne Perrin, Dimitri BMC Bioinformatics Research
BioMed Central 2018-12-21 /pmc/articles/PMC6302383/ /pubmed/30577803 http://dx.doi.org/10.1186/s12859-018-2540-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Rapid analysis of metagenomic data using signature-based clustering
topic Research