The Threshold Bootstrap Clustering: A New Approach to Find Families or Transmission Clusters within Molecular Quasispecies

BACKGROUND: Phylogenetic methods produce hierarchies of molecular species, inferring knowledge about taxonomy and evolution. However, there is not yet a consensus methodology that provides a crisp partition of taxa, desirable when considering the problem of intra/inter-patient quasispecies classific...

Bibliographic Details
Main authors: Prosperi, Mattia C. F.; De Luca, Andrea; Di Giambenedetto, Simona; Bracciale, Laura; Fabbiani, Massimiliano; Cauda, Roberto; Salemi, Marco
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2963616/
https://www.ncbi.nlm.nih.gov/pubmed/21049051
http://dx.doi.org/10.1371/journal.pone.0013619
_version_ 1782189296204120064
author Prosperi, Mattia C. F.
De Luca, Andrea
Di Giambenedetto, Simona
Bracciale, Laura
Fabbiani, Massimiliano
Cauda, Roberto
Salemi, Marco
author_sort Prosperi, Mattia C. F.
collection PubMed
description BACKGROUND: Phylogenetic methods produce hierarchies of molecular species, inferring knowledge about taxonomy and evolution. However, there is not yet a consensus methodology that provides a crisp partition of taxa, desirable when considering the problem of intra/inter-patient quasispecies classification or infection transmission event identification. We introduce the threshold bootstrap clustering (TBC), a new methodology for partitioning molecular sequences, that does not require a phylogenetic tree estimation. METHODOLOGY/PRINCIPAL FINDINGS: The TBC is an incremental partition algorithm, inspired by the stochastic Chinese restaurant process, and takes advantage of resampling techniques and models of sequence evolution. TBC uses as input a multiple alignment of molecular sequences and its output is a crisp partition of the taxa into an automatically determined number of clusters. By varying initial conditions, the algorithm can produce different partitions. We describe a procedure that selects a prime partition among a set of candidate ones and calculates a measure of cluster reliability. TBC was successfully tested for the identification of type-1 human immunodeficiency and hepatitis C virus subtypes, and compared with previously established methodologies. It was also evaluated in the problem of HIV-1 intra-patient quasispecies clustering, and for transmission cluster identification, using a set of sequences from patients with known transmission event histories. CONCLUSION: TBC has been shown to be effective for the subtyping of HIV and HCV, and for identifying intra-patient quasispecies. To some extent, the algorithm was able also to infer clusters corresponding to events of infection transmission. The computational complexity of TBC is quadratic in the number of taxa, lower than other established methods; in addition, TBC has been enhanced with a measure of cluster reliability. The TBC can be useful to characterise molecular quasispecies in a broad context.
format Text
id pubmed-2963616
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2963616 2010-11-03 The Threshold Bootstrap Clustering: A New Approach to Find Families or Transmission Clusters within Molecular Quasispecies Prosperi, Mattia C. F. De Luca, Andrea Di Giambenedetto, Simona Bracciale, Laura Fabbiani, Massimiliano Cauda, Roberto Salemi, Marco PLoS One Research Article BACKGROUND: Phylogenetic methods produce hierarchies of molecular species, inferring knowledge about taxonomy and evolution. However, there is not yet a consensus methodology that provides a crisp partition of taxa, desirable when considering the problem of intra/inter-patient quasispecies classification or infection transmission event identification. We introduce the threshold bootstrap clustering (TBC), a new methodology for partitioning molecular sequences, that does not require a phylogenetic tree estimation. METHODOLOGY/PRINCIPAL FINDINGS: The TBC is an incremental partition algorithm, inspired by the stochastic Chinese restaurant process, and takes advantage of resampling techniques and models of sequence evolution. TBC uses as input a multiple alignment of molecular sequences and its output is a crisp partition of the taxa into an automatically determined number of clusters. By varying initial conditions, the algorithm can produce different partitions. We describe a procedure that selects a prime partition among a set of candidate ones and calculates a measure of cluster reliability. TBC was successfully tested for the identification of type-1 human immunodeficiency and hepatitis C virus subtypes, and compared with previously established methodologies. It was also evaluated in the problem of HIV-1 intra-patient quasispecies clustering, and for transmission cluster identification, using a set of sequences from patients with known transmission event histories. CONCLUSION: TBC has been shown to be effective for the subtyping of HIV and HCV, and for identifying intra-patient quasispecies. To some extent, the algorithm was able also to infer clusters corresponding to events of infection transmission. The computational complexity of TBC is quadratic in the number of taxa, lower than other established methods; in addition, TBC has been enhanced with a measure of cluster reliability. The TBC can be useful to characterise molecular quasispecies in a broad context. Public Library of Science 2010-10-25 /pmc/articles/PMC2963616/ /pubmed/21049051 http://dx.doi.org/10.1371/journal.pone.0013619 Text en Prosperi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title The Threshold Bootstrap Clustering: A New Approach to Find Families or Transmission Clusters within Molecular Quasispecies
title_sort threshold bootstrap clustering: a new approach to find families or transmission clusters within molecular quasispecies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2963616/
https://www.ncbi.nlm.nih.gov/pubmed/21049051
http://dx.doi.org/10.1371/journal.pone.0013619