
Ultra-fast sequence clustering from similarity networks with SiLiX

BACKGROUND: The number of gene sequences that are available for comparative genomics approaches is increasing extremely quickly. A current challenge is to be able to handle this huge number of sequences in order to build families of homologous sequences in a reasonable time. RESULTS: We present the...


Bibliographic Details
Main authors: Miele, Vincent, Penel, Simon, Duret, Laurent
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3095554/
https://www.ncbi.nlm.nih.gov/pubmed/21513511
http://dx.doi.org/10.1186/1471-2105-12-116
author Miele, Vincent
Penel, Simon
Duret, Laurent
collection PubMed
description BACKGROUND: The number of gene sequences that are available for comparative genomics approaches is increasing extremely quickly. A current challenge is to be able to handle this huge number of sequences in order to build families of homologous sequences in a reasonable time. RESULTS: We present the software package SiLiX, which implements a novel method that reconsiders single linkage clustering with a graph theoretical approach. A parallel version of the algorithms is also presented. As a demonstration of the ability of our software, we clustered more than 3 million sequences from about 2 billion BLAST hits in 7 minutes, with a high clustering quality, both in terms of sensitivity and specificity. CONCLUSIONS: Compared with state-of-the-art software, SiLiX offers the best current capabilities for clustering large collections of sequences. SiLiX is freely available at http://lbbe.univ-lyon1.fr/SiLiX.
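The abstract describes single linkage clustering viewed as a graph problem: sequences are vertices, similarity hits are edges, and each family is a connected component. The following is a minimal sketch of that general idea using a union-find structure. It is not the SiLiX implementation; the function name `single_linkage_families` and the input format (an iterable of already filtered pairs of sequence identifiers) are assumptions made for illustration, and the hit-filtering criteria SiLiX applies are not covered by this record.

```python
from collections import defaultdict


def single_linkage_families(pairs):
    """Group sequence ids into families (connected components of the hit graph).

    `pairs` is a hypothetical input: an iterable of (id_a, id_b) tuples,
    one per similarity hit that has already passed whatever filter is in use.
    """
    parent = {}

    def find(x):
        # Register unseen ids as their own root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Merge the two components.
            parent[root_b] = root_a

    families = defaultdict(set)
    for seq in list(parent):
        families[find(seq)].add(seq)
    # Sorted output so the result is deterministic.
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    hits = [("s1", "s2"), ("s2", "s3"), ("s4", "s5")]
    print(single_linkage_families(hits))
```

Because each hit is processed once and only the parent map is kept in memory, a structure like this can stream through a hit list without storing the full graph, which is one reason connected-component formulations scale to large inputs.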
format Text
id pubmed-3095554
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3095554 2011-05-17 Ultra-fast sequence clustering from similarity networks with SiLiX Miele, Vincent Penel, Simon Duret, Laurent BMC Bioinformatics Software BioMed Central 2011-04-22 /pmc/articles/PMC3095554/ /pubmed/21513511 http://dx.doi.org/10.1186/1471-2105-12-116 Text en Copyright © 2011 Miele et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Ultra-fast sequence clustering from similarity networks with SiLiX
topic Software