Metascan: METabolic Analysis, SCreening and ANnotation of Metagenomes

Large scale next generation metagenomic sequencing of complex environmental samples paves the way for detailed analysis of nutrient cycles in ecosystems. For such an analysis, large scale unequivocal annotation is a prerequisite, which however is increasingly hampered by growing databases and analys...

Bibliographic Details
Main Authors: Cremers, Geert; Jetten, Mike S. M.; Op den Camp, Huub J. M.; Lücker, Sebastian
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580885/
https://www.ncbi.nlm.nih.gov/pubmed/36304333
http://dx.doi.org/10.3389/fbinf.2022.861505
author Cremers, Geert
Jetten, Mike S. M.
Op den Camp, Huub J. M.
Lücker, Sebastian
author_sort Cremers, Geert
collection PubMed
description Large-scale next-generation metagenomic sequencing of complex environmental samples paves the way for detailed analysis of nutrient cycles in ecosystems. Such analysis requires large-scale, unequivocal annotation, which is increasingly hampered by growing databases and analysis time. To address this, we created a hidden Markov model (HMM) database by clustering proteins according to their KEGG indexing. HMM profiles for key genes of specific metabolic pathways and nutrient cycles were organized in subsets so that each important elemental cycle can be analyzed separately. An important motivation behind the clustered database was to enable a high degree of resolution for annotation while decreasing database size and analysis time. Here, we present Metascan, a new tool that can fully annotate and analyze deeply sequenced samples, with an average analysis time of 11 min per genome for a publicly available dataset containing 2,537 genomes, and 1.1 min per genome for nutrient cycle analysis of the same sample. Metascan easily detected general proteins like cytochromes and ferredoxins, and it identified additional pmoCAB operons that were overlooked in previous analyses. For a mock community, the BEACON (F1) score was 0.72–0.93 compared to the information in NCBI GenBank. In combination with the accompanying database, Metascan provides a fast and useful annotation and analysis tool, as demonstrated by our proof-of-principle analysis of a complex mock community metagenome.
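The two kinds of figures quoted in the description can be made concrete with a short sketch. It assumes the standard definition of the F1 score (harmonic mean of precision and recall) and assumes that genomes are processed one after another, which the record does not state. The function names and the example counts are hypothetical and are not taken from Metascan or BEACON.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1 score: harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def total_hours(minutes_per_genome: float, n_genomes: int) -> float:
    """Total wall time in hours, assuming strictly sequential processing."""
    return minutes_per_genome * n_genomes / 60


# Figures quoted in the description.
N_GENOMES = 2537
full_annotation_hours = total_hours(11, N_GENOMES)    # full annotation run
nutrient_cycle_hours = total_hours(1.1, N_GENOMES)    # nutrient cycle subset only

# Hypothetical annotation counts, for illustration only.
example_f1 = f1_score(true_pos=8, false_pos=2, false_neg=2)

print(f"full annotation: {full_annotation_hours:.1f} h")
print(f"nutrient cycles: {nutrient_cycle_hours:.1f} h")
print(f"example F1:      {example_f1:.2f}")
```

Under these assumptions, the reported averages correspond to roughly 465 hours for the full annotation of all 2,537 genomes and roughly 47 hours for the nutrient cycle analysis.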
format Online
Article
Text
id pubmed-9580885
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9580885 2022-10-26 Metascan: METabolic Analysis, SCreening and ANnotation of Metagenomes. Cremers, Geert; Jetten, Mike S. M.; Op den Camp, Huub J. M.; Lücker, Sebastian. Front Bioinform. Bioinformatics. Frontiers Media S.A. 2022-06-22 /pmc/articles/PMC9580885/ /pubmed/36304333 http://dx.doi.org/10.3389/fbinf.2022.861505 Text en Copyright © 2022 Cremers, Jetten, Op den Camp and Lücker.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Metascan: METabolic Analysis, SCreening and ANnotation of Metagenomes
topic Bioinformatics