FastGroup: A program to dereplicate libraries of 16S rDNA sequences

BACKGROUND: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we...

Bibliographic Details
Main authors: Seguritan, Victor; Rohwer, Forest
Format: Text
Language: English
Published: BioMed Central 2001
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59723/
https://www.ncbi.nlm.nih.gov/pubmed/11707150
http://dx.doi.org/10.1186/1471-2105-2-9
author Seguritan, Victor
Rohwer, Forest
collection PubMed
description BACKGROUND: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, a Java program designed to dereplicate libraries of 16S rDNA sequences. By dereplication we mean to: 1) compare all the sequences in a data set to each other, 2) group similar sequences together, and 3) output a representative sequence from each group. In this way, duplicate sequences are removed from a library. RESULTS: FastGroup was tested using a library of single-pass, bacterial 16S rDNA sequences cloned from coral-associated bacteria. We found that the optimal strategy for dereplicating these sequences was to: 1) trim ambiguous bases from the 5' end of the sequences and all sequence 3' of the conserved Bact517 site, 2) match the sequences from the 3' end, and 3) group sequences >=97% identical to each other. CONCLUSIONS: The FastGroup program simplifies the dereplication of 16S rDNA sequence libraries and prepares the raw sequences for subsequent analyses.
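The three-step strategy in the description above (trim, match from the 3' end, group at >=97% identity) can be illustrated with a minimal Python sketch. This is not FastGroup's actual code (FastGroup is a Java program), and several details are assumptions made for illustration: the function names `trim`, `identity_from_3prime`, and `dereplicate` are hypothetical; "ambiguous bases" are taken to be any leading characters other than A, C, G, or T; the conserved site is passed in as a parameter and kept at the 3' end of the trimmed sequence; identity is an ungapped, position-by-position comparison over the shorter sequence; grouping is greedy, with the first member of each group acting as its representative. The motif `GGTAA` in the usage lines is a made-up stand-in, not the real Bact517 sequence.

```python
from typing import List, Optional

UNAMBIGUOUS = set("ACGT")


def trim(seq: str, site: str) -> Optional[str]:
    """Strip leading ambiguous bases and everything 3' of the conserved site.

    Assumption: the site itself is kept. Returns None if the site is absent.
    """
    seq = seq.upper()
    start = 0
    while start < len(seq) and seq[start] not in UNAMBIGUOUS:
        start += 1
    end = seq.find(site.upper(), start)
    if end == -1:
        return None
    return seq[start:end + len(site)]


def identity_from_3prime(a: str, b: str) -> float:
    """Fraction of matching positions, aligned at the 3' end (no gaps)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(reversed(a), reversed(b)) if x == y)
    return matches / n


def dereplicate(seqs: List[str], site: str, threshold: float = 0.97) -> List[List[str]]:
    """Greedily group trimmed sequences; group[0] is the representative."""
    groups: List[List[str]] = []
    for raw in seqs:
        trimmed = trim(raw, site)
        if trimmed is None:
            continue  # assumption: sequences lacking the site are skipped
        for group in groups:
            if identity_from_3prime(trimmed, group[0]) >= threshold:
                group.append(trimmed)
                break
        else:
            groups.append([trimmed])
    return groups


# Toy usage with a made-up site motif (not the real Bact517 sequence).
library = ["NNACGTACGTGGTAATTT", "ACGTACGTGGTAACC", "TTTTTTTTGGTAA", "ACGTACGT"]
groups = dereplicate(library, site="GGTAA")
representatives = [group[0] for group in groups]
print(representatives)
```

Comparing each sequence only against group representatives keeps the sketch short; an all-against-all comparison, as the description mentions, would need a different grouping step.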
format Text
id pubmed-59723
institution National Center for Biotechnology Information
language English
publishDate 2001
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-59723 2001-11-14 FastGroup: A program to dereplicate libraries of 16S rDNA sequences Seguritan, Victor Rohwer, Forest BMC Bioinformatics Methodology Article BioMed Central 2001-10-16 /pmc/articles/PMC59723/ /pubmed/11707150 http://dx.doi.org/10.1186/1471-2105-2-9 Text en Copyright © 2001 Seguritan and Rohwer; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title FastGroup: A program to dereplicate libraries of 16S rDNA sequences
topic Methodology Article