MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences

Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and...

Bibliographic Details
Main authors: Luo, Chengwei; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4005636/
https://www.ncbi.nlm.nih.gov/pubmed/24589583
http://dx.doi.org/10.1093/nar/gku169
_version_ 1782314130364956672
author Luo, Chengwei
Rodriguez-R, Luis M.
Konstantinidis, Konstantinos T.
collection PubMed
description Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and genomic sequences with unprecedented accuracy. The distinguishing aspect of MyTaxa is that it employs all genes present in an unknown sequence as classifiers, weighting each gene based on its (predetermined) classifying power at a given taxonomic level and frequency of horizontal gene transfer. MyTaxa also implements a novel classification scheme based on the genome-aggregate average amino acid identity concept to determine the degree of novelty of sequences representing uncharacterized taxa, i.e. whether they represent novel species, genera or phyla. Application of MyTaxa on in silico generated (mock) and real metagenomes of varied read length (100–2000 bp) revealed that it correctly classified at least 5% more sequences than any other tool. The analysis also showed that ∼10% of the assembled sequences from human gut metagenomes represent novel species with no sequenced representatives, several of which were highly abundant in situ such as members of the Prevotella genus. Thus, MyTaxa can find several important applications in microbial identification and diversity studies.
format Online
Article
Text
id pubmed-4005636
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4005636 2014-05-01 MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences Luo, Chengwei Rodriguez-R, Luis M. Konstantinidis, Konstantinos T. Nucleic Acids Res Methods Online Oxford University Press 2014-04 2014-03-03 /pmc/articles/PMC4005636/ /pubmed/24589583 http://dx.doi.org/10.1093/nar/gku169 Text en © The Author(s) 2014. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Luo, Chengwei
Rodriguez-R, Luis M.
Konstantinidis, Konstantinos T.
MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences
title MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4005636/
https://www.ncbi.nlm.nih.gov/pubmed/24589583
http://dx.doi.org/10.1093/nar/gku169