
Automated analysis of phylogenetic clusters

BACKGROUND: As sequence data sets used for the investigation of pathogen transmission patterns increase in size, automated tools and standardized methods for cluster analysis have become necessary. We have developed an automated Cluster Picker which identifies monophyletic clades meeting user-input...


Bibliographic Details
Main authors: Ragonnet-Cronin, Manon, Hodcroft, Emma, Hué, Stéphane, Fearnhill, Esther, Delpech, Valerie, Brown, Andrew J Leigh, Lycett, Samantha
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4228337/
https://www.ncbi.nlm.nih.gov/pubmed/24191891
http://dx.doi.org/10.1186/1471-2105-14-317
author Ragonnet-Cronin, Manon
Hodcroft, Emma
Hué, Stéphane
Fearnhill, Esther
Delpech, Valerie
Brown, Andrew J Leigh
Lycett, Samantha
author_sort Ragonnet-Cronin, Manon
collection PubMed
description BACKGROUND: As sequence data sets used for the investigation of pathogen transmission patterns increase in size, automated tools and standardized methods for cluster analysis have become necessary. We have developed an automated Cluster Picker which identifies monophyletic clades meeting user-input criteria for bootstrap support and maximum genetic distance within large phylogenetic trees. A second tool, the Cluster Matcher, automates the process of linking genetic data to epidemiological or clinical data, and matches clusters between runs of the Cluster Picker. RESULTS: We explore the effect of different bootstrap and genetic distance thresholds on clusters identified in a data set of publicly available HIV sequences, and compare these results to those of a previously published tool for cluster identification. To demonstrate their utility, we then use the Cluster Picker and Cluster Matcher together to investigate how clusters in the data set changed over time. We find that clusters containing sequences from more than one UK location at the first time point (multiple origin) were significantly more likely to grow than those representing only a single location. CONCLUSIONS: The Cluster Picker and Cluster Matcher can rapidly process phylogenetic trees containing tens of thousands of sequences. Together these tools will facilitate comparisons of pathogen transmission dynamics between studies and countries.
format Online
Article
Text
id pubmed-4228337
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4228337 2014-11-13
Automated analysis of phylogenetic clusters
Ragonnet-Cronin, Manon; Hodcroft, Emma; Hué, Stéphane; Fearnhill, Esther; Delpech, Valerie; Brown, Andrew J Leigh; Lycett, Samantha
BMC Bioinformatics
Software
BioMed Central 2013-11-06
/pmc/articles/PMC4228337/ /pubmed/24191891 http://dx.doi.org/10.1186/1471-2105-14-317
Text en
Copyright © 2013 Ragonnet-Cronin et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated analysis of phylogenetic clusters
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4228337/
https://www.ncbi.nlm.nih.gov/pubmed/24191891
http://dx.doi.org/10.1186/1471-2105-14-317
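
The description in this record says the Cluster Picker identifies monophyletic clades that meet user-input criteria for bootstrap support and maximum genetic distance. As a rough illustration of that idea only, the Python sketch below walks a small tree from the root and reports the first clade on each path that passes both thresholds. It is not the authors' implementation: the `Node` class, the `pick_clusters` function, the pairwise-distance dictionary, the toy tree, and the threshold values are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    """Hypothetical tree node; a tip has a name and no children."""
    name: str = ""
    support: float = 0.0  # bootstrap support for the clade below this node
    children: list = field(default_factory=list)


def tips(node):
    """Return the names of all tips below (or at) a node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in tips(child)]


def max_distance(names, dist):
    """Largest pairwise genetic distance among the named tips."""
    return max((dist[frozenset(p)] for p in combinations(names, 2)), default=0.0)


def pick_clusters(node, dist, min_support, max_dist):
    """Depth-first search for clades passing both thresholds.

    A clade is reported as a cluster when its support is at least
    min_support and no pair of its tips is further apart than max_dist.
    Once a clade is reported, its sub-clades are not examined.
    """
    if not node.children:
        return []  # a single tip is not a cluster
    names = tips(node)
    if node.support >= min_support and max_distance(names, dist) <= max_dist:
        return [sorted(names)]
    clusters = []
    for child in node.children:
        clusters.extend(pick_clusters(child, dist, min_support, max_dist))
    return clusters


# Toy tree: ((A,B)0.95,(C,D)0.99)0.50 with made-up distances.
tree = Node(support=0.50, children=[
    Node(support=0.95, children=[Node("A"), Node("B")]),
    Node(support=0.99, children=[Node("C"), Node("D")]),
])
dist = {frozenset(p): 0.20 for p in combinations("ABCD", 2)}
dist[frozenset("AB")] = 0.01
dist[frozenset("CD")] = 0.10

print(pick_clusters(tree, dist, min_support=0.90, max_dist=0.045))
```

With these made-up values only the A/B clade passes both thresholds; loosening `max_dist` admits the C/D clade as well, which is the kind of threshold sensitivity the description says the authors explore.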