COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets
BACKGROUND: Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes that are being sequenced world-wide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools/techniques. In spite...
Main Authors: | Bose, Tungadri; Haque, Mohammed Monzoorul; Reddy, CVSK; Mande, Sharmila S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4641738/ https://www.ncbi.nlm.nih.gov/pubmed/26561344 http://dx.doi.org/10.1371/journal.pone.0142102 |
_version_ | 1782400250415153152 |
---|---|
author | Bose, Tungadri Haque, Mohammed Monzoorul Reddy, CVSK Mande, Sharmila S. |
author_facet | Bose, Tungadri Haque, Mohammed Monzoorul Reddy, CVSK Mande, Sharmila S. |
author_sort | Bose, Tungadri |
collection | PubMed |
description | BACKGROUND: Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes being sequenced worldwide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools and techniques. Although existing stand-alone functional annotation tools are highly accurate, they require end-users to perform compute-intensive homology searches of metagenomic datasets against multiple databases prior to functional analysis. Although web-based functional annotation servers address, to some extent, the problem of availability of compute resources, uploading and analyzing huge volumes of sequence data on a shared public web service has its own set of limitations. In this study, we present COGNIZER, a comprehensive stand-alone annotation framework that enables end-users to functionally annotate sequences constituting metagenomic datasets. The COGNIZER framework provides multiple workflow options. A subset of these options employs a novel directed-search strategy that helps reduce the overall compute requirements for end-users. The COGNIZER framework includes a cross-mapping database that enables end-users to simultaneously derive or infer KEGG, Pfam, GO, and SEED subsystem information from the COG annotations. RESULTS: Validation experiments performed with real-world metagenomes and metatranscriptomes, generated using diverse sequencing technologies, indicate that the novel directed-search strategy employed in COGNIZER helps reduce the compute requirements without significant loss in annotation accuracy. A comparison of COGNIZER's results with pre-computed benchmark values indicates the reliability of the cross-mapping database employed in COGNIZER. CONCLUSION: The COGNIZER framework is capable of comprehensively annotating, in functional terms, any metagenomic or metatranscriptomic dataset from varied sequencing platforms. Multiple search options in COGNIZER give end-users the flexibility to choose a homology search protocol based on available compute resources. The cross-mapping database in COGNIZER is of high utility since it enables end-users to directly infer or derive KEGG, Pfam, GO, and SEED subsystem annotations from COG categorizations. Furthermore, the availability of COGNIZER as a stand-alone scalable implementation is expected to make it a valuable annotation tool in the field of metagenomic research. AVAILABILITY AND IMPLEMENTATION: A Linux implementation of COGNIZER is freely available for download from the following links: http://metagenomics.atc.tcs.com/cognizer, https://metagenomics.atc.tcs.com/function/cognizer. |
format | Online Article Text |
id | pubmed-4641738 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4641738 2015-11-18 COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets Bose, Tungadri Haque, Mohammed Monzoorul Reddy, CVSK Mande, Sharmila S. PLoS One Research Article Public Library of Science 2015-11-11 /pmc/articles/PMC4641738/ /pubmed/26561344 http://dx.doi.org/10.1371/journal.pone.0142102 Text en © 2015 Bose et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Bose, Tungadri Haque, Mohammed Monzoorul Reddy, CVSK Mande, Sharmila S. COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets |
title | COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets |
title_full | COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets |
title_fullStr | COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets |
title_full_unstemmed | COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets |
title_short | COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets |
title_sort | cognizer: a framework for functional annotation of metagenomic datasets |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4641738/ https://www.ncbi.nlm.nih.gov/pubmed/26561344 http://dx.doi.org/10.1371/journal.pone.0142102 |
work_keys_str_mv | AT bosetungadri cognizeraframeworkforfunctionalannotationofmetagenomicdatasets AT haquemohammedmonzoorul cognizeraframeworkforfunctionalannotationofmetagenomicdatasets AT reddycvsk cognizeraframeworkforfunctionalannotationofmetagenomicdatasets AT mandesharmilas cognizeraframeworkforfunctionalannotationofmetagenomicdatasets |