COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets

BACKGROUND: Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes that are being sequenced world-wide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools/techniques. In spite...

Bibliographic Details
Main Authors: Bose, Tungadri; Haque, Mohammed Monzoorul; Reddy, CVSK; Mande, Sharmila S.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4641738/
https://www.ncbi.nlm.nih.gov/pubmed/26561344
http://dx.doi.org/10.1371/journal.pone.0142102
author Bose, Tungadri
Haque, Mohammed Monzoorul
Reddy, CVSK
Mande, Sharmila S.
collection PubMed
description BACKGROUND: Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes being sequenced worldwide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools and techniques. Despite their high accuracy, existing stand-alone functional annotation tools require end-users to perform compute-intensive homology searches of metagenomic datasets against "multiple" databases prior to functional analysis. Although web-based functional annotation servers address, to some extent, the problem of availability of compute resources, uploading and analyzing huge volumes of sequence data on a shared public web service has its own set of limitations. In this study, we present COGNIZER, a comprehensive stand-alone annotation framework that enables end-users to functionally annotate the sequences constituting metagenomic datasets. The COGNIZER framework provides multiple workflow options. A subset of these options employs a novel directed-search strategy that helps reduce the overall compute requirements for end-users. The COGNIZER framework includes a cross-mapping database that enables end-users to simultaneously derive or infer KEGG, Pfam, GO, and SEED subsystem information from the COG annotations.
RESULTS: Validation experiments performed with real-world metagenomes and metatranscriptomes, generated using diverse sequencing technologies, indicate that the novel directed-search strategy employed in COGNIZER helps reduce compute requirements without significant loss in annotation accuracy. A comparison of COGNIZER's results with pre-computed benchmark values indicates the reliability of the cross-mapping database employed in COGNIZER.
CONCLUSION: The COGNIZER framework is capable of comprehensively annotating, in functional terms, any metagenomic or metatranscriptomic dataset from varied sequencing platforms. Multiple search options in COGNIZER give end-users the flexibility to choose a homology search protocol based on available compute resources. The cross-mapping database in COGNIZER is of high utility since it enables end-users to directly infer or derive KEGG, Pfam, GO, and SEED subsystem annotations from COG categorizations. Furthermore, the availability of COGNIZER as a stand-alone scalable implementation is expected to make it a valuable annotation tool in the field of metagenomic research.
AVAILABILITY AND IMPLEMENTATION: A Linux implementation of COGNIZER is freely available for download from the following links: http://metagenomics.atc.tcs.com/cognizer, https://metagenomics.atc.tcs.com/function/cognizer.
format Online
Article
Text
id pubmed-4641738
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-46417382015-11-18 COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets Bose, Tungadri Haque, Mohammed Monzoorul Reddy, CVSK Mande, Sharmila S. PLoS One Research Article Public Library of Science 2015-11-11 /pmc/articles/PMC4641738/ /pubmed/26561344 http://dx.doi.org/10.1371/journal.pone.0142102 Text en © 2015 Bose et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4641738/
https://www.ncbi.nlm.nih.gov/pubmed/26561344
http://dx.doi.org/10.1371/journal.pone.0142102