
Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN

The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic...


Bibliographic Details
Main authors: Dumont, Marc G., Lüke, Claudia, Deng, Yongcui, Frenzel, Peter
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Microbiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3927136/
https://www.ncbi.nlm.nih.gov/pubmed/22558000
http://dx.doi.org/10.3389/fmicb.2014.00034
author Dumont, Marc G.
Lüke, Claudia
Deng, Yongcui
Frenzel, Peter
collection PubMed
description The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including sequences from both cultivated and uncultivated organisms. Representative sequences from closely related genes, such as those encoding the bacterial ammonia monooxygenase, were also included in the pmoA taxonomy. In total, 53 low-level (genus-level) taxa are included. Using previously published datasets of high-throughput pmoA amplicon sequence data, we tested two approaches for classifying pmoA: a naïve Bayesian classifier and BLAST. Classification of pmoA sequences based on BLAST analyses was performed using the lowest common ancestor (LCA) algorithm in MEGAN, a software program commonly used for the analysis of metagenomic data. Both the naïve Bayesian and BLAST methods were able to classify pmoA sequences and provided similar classifications; however, the naïve Bayesian classifier was prone to misclassifying contaminant sequences present in the datasets. Another advantage of the BLAST/LCA method was that it provided a user-interpretable output and enabled novelty detection at various levels, from highly divergent pmoA sequences to genus-level novelty.
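The lowest common ancestor idea mentioned in the description can be illustrated with a minimal Python sketch. This is not MEGAN's implementation and not the authors' pipeline: the reference IDs, the three-level taxonomy paths, the function names, and the threshold values are all hypothetical and chosen only for illustration. The sketch assumes each BLAST hit is reduced to a (reference ID, bit score) pair, keeps hits that score within a percentage of the best hit, and assigns the read to the deepest taxon shared by all kept hits.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical taxonomy paths (root -> genus-level taxon) for three
# made-up reference sequences. Not the 53-taxon system of the record.
TAXONOMY: Dict[str, Tuple[str, ...]] = {
    "refA": ("pmoA", "Type I", "Methylobacter"),
    "refB": ("pmoA", "Type I", "Methylomonas"),
    "refC": ("pmoA", "Type II", "Methylocystis"),
}


def lowest_common_ancestor(paths: Sequence[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Return the longest prefix shared by all taxonomy paths."""
    if not paths:
        return ()
    shared: List[str] = []
    for ranks in zip(*paths):
        # Stop at the first rank where the paths disagree.
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return tuple(shared)


def classify(
    hits: Sequence[Tuple[str, float]],
    min_score: float = 50.0,    # arbitrary illustrative threshold
    top_percent: float = 10.0,  # arbitrary illustrative threshold
) -> Tuple[str, ...]:
    """Assign a read to the LCA of its best-scoring hits.

    hits: (reference_id, bit_score) pairs for one query read.
    Returns an empty tuple when no hit passes min_score.
    """
    kept = [(ref, score) for ref, score in hits if score >= min_score]
    if not kept:
        return ()
    best = max(score for _, score in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    paths = [TAXONOMY[ref] for ref, score in kept if score >= cutoff]
    return lowest_common_ancestor(paths)


if __name__ == "__main__":
    # Two near-equal hits from different genera in the same group.
    print(classify([("refA", 200.0), ("refB", 195.0)]))
    # One clearly best hit: the read keeps its genus-level assignment.
    print(classify([("refA", 200.0), ("refC", 100.0)]))
```

In this sketch a read whose top hits disagree at genus level is placed at the higher shared rank, while a read with no hit above the minimum score gets an empty assignment. That mirrors, in a simplified way, how an LCA assignment can make the classification easy to inspect and flag reads that fit poorly.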
format Online
Article
Text
id pubmed-3927136
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3927136 2014-03-05 Front Microbiol Microbiology Frontiers Media S.A. 2014-02-18 /pmc/articles/PMC3927136/ /pubmed/22558000 http://dx.doi.org/10.3389/fmicb.2014.00034 Text en Copyright © 2014 Dumont, Lüke, Deng and Frenzel.
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN
topic Microbiology