
CREST – Classification Resources for Environmental Sequence Tags

Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding...


Bibliographic Details
Main Authors: Lanzén, Anders, Jørgensen, Steffen L., Huson, Daniel H., Gorfer, Markus, Grindhaug, Svenn Helge, Jonassen, Inge, Øvreås, Lise, Urich, Tim
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493522/
https://www.ncbi.nlm.nih.gov/pubmed/23145153
http://dx.doi.org/10.1371/journal.pone.0049334
_version_ 1782249278816649216
author Lanzén, Anders
Jørgensen, Steffen L.
Huson, Daniel H.
Gorfer, Markus
Grindhaug, Svenn Helge
Jonassen, Inge
Øvreås, Lise
Urich, Tim
author_facet Lanzén, Anders
Jørgensen, Steffen L.
Huson, Daniel H.
Gorfer, Markus
Grindhaug, Svenn Helge
Jonassen, Inge
Øvreås, Lise
Urich, Tim
author_sort Lanzén, Anders
collection PubMed
description Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
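The description above mentions alignment-based classification with a lowest common ancestor (LCA) step and explicit rank similarity criteria. The following is a minimal sketch of that general idea, not CREST's actual implementation. The `Hit` type, the `classify` function, the 2% bit-score margin, the rank list and the identity thresholds are all hypothetical choices made for illustration; none of them are stated in this record.

```python
from dataclasses import dataclass

# Hypothetical rank order and minimum percent identities per rank.
# Illustrative values only; they are assumptions, not taken from the record.
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]
MIN_IDENTITY = {
    "domain": 0.0, "phylum": 80.0, "class": 85.0, "order": 90.0,
    "family": 95.0, "genus": 97.0, "species": 99.0,
}


@dataclass
class Hit:
    """One alignment of the query against a reference sequence (hypothetical type)."""
    lineage: list[str]   # taxon names from domain downward, at most one per rank
    bit_score: float     # alignment bit score
    identity: float      # percent identity of the alignment


def classify(hits, score_margin=0.02):
    """Return the lineage assigned to a query; it may stop above species level."""
    if not hits:
        return []

    # Keep only hits whose bit score is within the margin of the best hit.
    best = max(hits, key=lambda h: h.bit_score)
    cutoff = best.bit_score * (1.0 - score_margin)
    top = [h for h in hits if h.bit_score >= cutoff]

    # Lowest common ancestor: the longest lineage prefix shared by all top hits.
    lca = []
    for names in zip(*(h.lineage for h in top)):
        if len(set(names)) != 1:
            break
        lca.append(names[0])

    # Rank similarity filter: move up the lineage until the best hit's
    # identity meets the threshold of the rank being assigned.
    while lca and best.identity < MIN_IDENTITY[RANKS[len(lca) - 1]]:
        lca.pop()
    return lca
```

In this sketch, two near-equal hits from different genera of the same family yield a family-level assignment, and a single hit with low identity is pushed up to a higher rank instead of being reported at species level. That second behavior is one way a classifier can flag sequences from taxa that are missing from the reference database.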
format Online
Article
Text
id pubmed-3493522
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-34935222012-11-09 CREST – Classification Resources for Environmental Sequence Tags Lanzén, Anders Jørgensen, Steffen L. Huson, Daniel H. Gorfer, Markus Grindhaug, Svenn Helge Jonassen, Inge Øvreås, Lise Urich, Tim PLoS One Research Article Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. 
Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. Public Library of Science 2012-11-08 /pmc/articles/PMC3493522/ /pubmed/23145153 http://dx.doi.org/10.1371/journal.pone.0049334 Text en © 2012 Lanzén et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Lanzén, Anders
Jørgensen, Steffen L.
Huson, Daniel H.
Gorfer, Markus
Grindhaug, Svenn Helge
Jonassen, Inge
Øvreås, Lise
Urich, Tim
CREST – Classification Resources for Environmental Sequence Tags
title CREST – Classification Resources for Environmental Sequence Tags
title_full CREST – Classification Resources for Environmental Sequence Tags
title_fullStr CREST – Classification Resources for Environmental Sequence Tags
title_full_unstemmed CREST – Classification Resources for Environmental Sequence Tags
title_short CREST – Classification Resources for Environmental Sequence Tags
title_sort crest – classification resources for environmental sequence tags
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493522/
https://www.ncbi.nlm.nih.gov/pubmed/23145153
http://dx.doi.org/10.1371/journal.pone.0049334
work_keys_str_mv AT lanzenanders crestclassificationresourcesforenvironmentalsequencetags
AT jørgensensteffenl crestclassificationresourcesforenvironmentalsequencetags
AT husondanielh crestclassificationresourcesforenvironmentalsequencetags
AT gorfermarkus crestclassificationresourcesforenvironmentalsequencetags
AT grindhaugsvennhelge crestclassificationresourcesforenvironmentalsequencetags
AT jonasseninge crestclassificationresourcesforenvironmentalsequencetags
AT øvreaslise crestclassificationresourcesforenvironmentalsequencetags
AT urichtim crestclassificationresourcesforenvironmentalsequencetags