Genometa - A Fast and Accurate Classifier for Short Metagenomic Shotgun Reads

SUMMARY: Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phy...

Bibliographic Details
Main Authors: Davenport, Colin F., Neugebauer, Jens, Beckmann, Nils, Friedrich, Benedikt, Kameri, Burim, Kokott, Svea, Paetow, Malte, Siekmann, Björn, Wieding-Drewes, Matthias, Wienhöfer, Markus, Wolf, Stefan, Tümmler, Burkhard, Ahlers, Volker, Sprengel, Frauke
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424124/
https://www.ncbi.nlm.nih.gov/pubmed/22927906
http://dx.doi.org/10.1371/journal.pone.0041224
author Davenport, Colin F.
Neugebauer, Jens
Beckmann, Nils
Friedrich, Benedikt
Kameri, Burim
Kokott, Svea
Paetow, Malte
Siekmann, Björn
Wieding-Drewes, Matthias
Wienhöfer, Markus
Wolf, Stefan
Tümmler, Burkhard
Ahlers, Volker
Sprengel, Frauke
collection PubMed
description SUMMARY: Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer. AVAILABILITY: The Genometa program, a step by step tutorial and Java source code are freely available from http://genomics1.mh-hannover.de/genometa/ and on http://code.google.com/p/genometa/. This program has been tested on Ubuntu Linux and Windows XP/7.
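The summary describes attributing reads to bacteria at species level, with all strains of a species counted together. The following is a minimal sketch of that counting step only, not Genometa's actual Java implementation. It assumes each read has already been aligned to a single reference genome by a short read aligner, and that the reference name begins with "Genus species". The function names and the example hits are hypothetical.

```python
from collections import Counter


def species_of(reference_name: str) -> str:
    """Reduce a reference genome name to 'Genus species'.

    Assumption: the first two whitespace-separated words of the
    reference name are the genus and the species.
    """
    return " ".join(reference_name.split()[:2])


def count_reads_per_species(read_hits):
    """Count aligned reads per species, pooling all strains.

    read_hits: iterable of (read_id, reference_name) pairs,
    one pair per aligned read (hypothetical input format).
    """
    counts = Counter()
    for _read_id, reference_name in read_hits:
        counts[species_of(reference_name)] += 1
    return counts


# Made-up example hits: two strains of one species plus a second species.
hits = [
    ("read1", "Escherichia coli K-12 MG1655"),
    ("read2", "Escherichia coli O157:H7 Sakai"),
    ("read3", "Bacteroides vulgatus ATCC 8482"),
]
species_counts = count_reads_per_species(hits)
```

As the summary notes, a count of this kind can only report species whose genomes are present in the reference set; reads from absent species stay unaligned and are not counted.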
format Online
Article
Text
id pubmed-3424124
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3424124 2012-08-27 Genometa - A Fast and Accurate Classifier for Short Metagenomic Shotgun Reads PLoS One Research Article Public Library of Science 2012-08-21 /pmc/articles/PMC3424124/ /pubmed/22927906 http://dx.doi.org/10.1371/journal.pone.0041224 Text en © 2012 Davenport et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Genometa - A Fast and Accurate Classifier for Short Metagenomic Shotgun Reads
topic Research Article