ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies
TQMD is a tool for high-performance computing clusters which downloads, stores and produces lists of dereplicated prokaryotic genomes. It has been developed to counter the ever-growing number of prokaryotic genomes and their uneven taxonomic distribution. It is based on word-based alignment-free met...
Main authors: | Léonard, Raphaël R.; Leleu, Marie; Van Vlierberghe, Mick; Cornet, Luc; Kerff, Frédéric; Baurain, Denis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8106394/ https://www.ncbi.nlm.nih.gov/pubmed/33996287 http://dx.doi.org/10.7717/peerj.11348 |
_version_ | 1783689768310145024 |
---|---|
author | Léonard, Raphaël R. Leleu, Marie Van Vlierberghe, Mick Cornet, Luc Kerff, Frédéric Baurain, Denis |
author_facet | Léonard, Raphaël R. Leleu, Marie Van Vlierberghe, Mick Cornet, Luc Kerff, Frédéric Baurain, Denis |
author_sort | Léonard, Raphaël R. |
collection | PubMed |
description | TQMD is a tool for high-performance computing clusters which downloads, stores and produces lists of dereplicated prokaryotic genomes. It has been developed to counter the ever-growing number of prokaryotic genomes and their uneven taxonomic distribution. It is based on word-based alignment-free methods (k-mers), an iterative single-linkage approach and a divide-and-conquer strategy to remain both efficient and scalable. We studied the performance of TQMD by verifying the influence of its parameters and heuristics on the clustering outcome. We further compared TQMD to two other dereplication tools (dRep and Assembly-Dereplicator). Our results showed that TQMD is primarily optimized to dereplicate at higher taxonomic levels (phylum/class), as opposed to the other dereplication tools, but also works at lower taxonomic levels (species/strain) like the other dereplication tools. TQMD is available from source and as a Singularity container at https://bitbucket.org/phylogeno/tqmd. |
format | Online Article Text |
id | pubmed-8106394 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8106394 2021-05-13 ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies Léonard, Raphaël R. Leleu, Marie Van Vlierberghe, Mick Cornet, Luc Kerff, Frédéric Baurain, Denis PeerJ Bioinformatics TQMD is a tool for high-performance computing clusters which downloads, stores and produces lists of dereplicated prokaryotic genomes. It has been developed to counter the ever-growing number of prokaryotic genomes and their uneven taxonomic distribution. It is based on word-based alignment-free methods (k-mers), an iterative single-linkage approach and a divide-and-conquer strategy to remain both efficient and scalable. We studied the performance of TQMD by verifying the influence of its parameters and heuristics on the clustering outcome. We further compared TQMD to two other dereplication tools (dRep and Assembly-Dereplicator). Our results showed that TQMD is primarily optimized to dereplicate at higher taxonomic levels (phylum/class), as opposed to the other dereplication tools, but also works at lower taxonomic levels (species/strain) like the other dereplication tools. TQMD is available from source and as a Singularity container at https://bitbucket.org/phylogeno/tqmd. PeerJ Inc. 2021-05-05 /pmc/articles/PMC8106394/ /pubmed/33996287 http://dx.doi.org/10.7717/peerj.11348 Text en ©2021 Léonard et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Léonard, Raphaël R. Leleu, Marie Van Vlierberghe, Mick Cornet, Luc Kerff, Frédéric Baurain, Denis ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies |
title | ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies |
title_full | ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies |
title_fullStr | ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies |
title_full_unstemmed | ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies |
title_short | ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies |
title_sort | torquemada: tool for retrieving queried eubacteria, metadata and dereplicating assemblies |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8106394/ https://www.ncbi.nlm.nih.gov/pubmed/33996287 http://dx.doi.org/10.7717/peerj.11348 |