Undervalued Pseudo-nifH Sequences in Public Databases Distort Metagenomic Insights into Biological Nitrogen Fixers

Nitrogen fixation, a distinct process incorporating the inactive atmospheric nitrogen into the active biological processes, has been a major topic in biological and geochemical studies. Currently, insights into diversity and distribution of nitrogen-fixing microbes are dependent upon homology-based...

Bibliographic Details
Main Authors: Mise, Kazumori; Masuda, Yoko; Senoo, Keishi; Itoh, Hideomi
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8597730/
https://www.ncbi.nlm.nih.gov/pubmed/34787447
http://dx.doi.org/10.1128/msphere.00785-21
_version_ 1784600661901443072
author Mise, Kazumori
Masuda, Yoko
Senoo, Keishi
Itoh, Hideomi
collection PubMed
description Nitrogen fixation, the process that converts inert atmospheric nitrogen into biologically available forms, has been a major topic in biological and geochemical studies. Currently, insights into the diversity and distribution of nitrogen-fixing microbes depend on homology-based analyses of nitrogenase genes, especially the nifH gene, which are broadly conserved in nitrogen-fixing microbes. Here, we report a pitfall of using nifH as a marker of microbial nitrogen fixation. We exhaustively analyzed genomes in RefSeq (231,908 genomes) and KEGG (6,509 genomes), along with the cooccurrence and gene order patterns of nitrogenase genes (including nifH) therein. Up to 20% of nifH-harboring genomes lacked nifD and nifK, which encode essential subunits of nitrogenase, within 10 coding sequences upstream or downstream of nifH or on the same genome. According to a phenotypic database of prokaryotes, no species or strain harboring only nifH possesses nitrogen-fixing activity, which indicates that these nifH genes are “pseudo”-nifH genes. Pseudo-nifH sequences mainly belong to anaerobic microbes, including members of the class Clostridia and methanogens. We also detected many pseudo-nifH reads in metagenomic sequences from anaerobic environments such as animal guts, wastewater, paddy soils, and sediments. In some samples, pseudo-nifH reads outnumbered “true” nifH reads by 50% or by as much as 10-fold. Because of the high sequence similarity between pseudo- and true-nifH, a substantial share of nifH-like reads could not be confidently classified. Overall, our results encourage reconsideration of the conventional use of nifH for detecting nitrogen-fixing microbes, while suggesting that nifD or nifK would be a more reliable marker.
IMPORTANCE Nitrogen-fixing microbes affect biogeochemical cycling, agricultural productivity, and microbial ecosystems, and their distributions have been investigated intensively using genomic and metagenomic sequencing. To date, insights into nitrogen fixers in the environment have been acquired by homology searches against nitrogenase genes, particularly the nifH gene, in public databases. Here, we report that public databases include a significant number of incorrectly annotated nifH sequences (pseudo-nifH). We exhaustively investigated the genomic structures of nifH-harboring genomes and found hundreds of pseudo-nifH sequences in RefSeq and KEGG. Over half of these pseudo-nifH sequences belonged to members of the class Clostridia, which is generally considered a prominent nitrogen-fixing clade. We also found that the abundance of nitrogen fixers in metagenomes could be overestimated by 1.5 to >10 times due to pseudo-nifH recorded in public databases. Our results encourage reconsideration of the prevalent use of nifH as a marker of nitrogen-fixing microbes.
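The cooccurrence check described in the abstract (is nifH accompanied by nifD and nifK within 10 coding sequences on either side, or anywhere in the same genome?) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `classify_nifh`, the input format (one ordered list of gene names per genome), and the rule that a nifH copy is a pseudo-nifH candidate when the genome lacks nifD or nifK are all assumptions made for illustration. The sketch also treats the genome as a single linear contig, ignoring circular chromosomes and multi-contig assemblies.

```python
from typing import Dict, List, Optional

WINDOW = 10  # coding sequences inspected on each side of nifH (value from the abstract)


def classify_nifh(genes: List[Optional[str]], window: int = WINDOW) -> List[Dict]:
    """Flag each nifH copy in one genome (hypothetical helper, not the authors' code).

    `genes` is the ordered list of gene names for the coding sequences of a
    single linear contig; unannotated CDSs may be given as None.
    """
    # Genome-wide presence of the two other structural genes.
    genome_has_dk = "nifD" in genes and "nifK" in genes
    results = []
    for i, name in enumerate(genes):
        if name != "nifH":
            continue
        # Up to `window` CDSs upstream and downstream, excluding nifH itself.
        upstream = genes[max(0, i - window):i]
        downstream = genes[i + 1:i + 1 + window]
        neighbours = upstream + downstream
        nearby = "nifD" in neighbours and "nifK" in neighbours
        results.append({
            "index": i,
            "nifDK_nearby": nearby,
            "nifDK_in_genome": genome_has_dk,
            # Assumed rule: pseudo-nifH candidate if nifD or nifK is missing genome-wide.
            "pseudo_candidate": not genome_has_dk,
        })
    return results


if __name__ == "__main__":
    operon = ["geneA", "nifH", "nifD", "nifK", "geneB"]
    orphan = ["geneA", "nifH", "geneB"]
    print(classify_nifh(operon))
    print(classify_nifh(orphan))
```

Reporting the neighbourhood result and the genome-wide result separately mirrors the two criteria named in the abstract, so a reader can apply either the stricter (nearby) or the looser (same genome) definition.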
format Online
Article
Text
id pubmed-8597730
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8597730 2021-11-29 Undervalued Pseudo-nifH Sequences in Public Databases Distort Metagenomic Insights into Biological Nitrogen Fixers Mise, Kazumori Masuda, Yoko Senoo, Keishi Itoh, Hideomi mSphere Research Article American Society for Microbiology 2021-11-17 /pmc/articles/PMC8597730/ /pubmed/34787447 http://dx.doi.org/10.1128/msphere.00785-21 Text en Copyright © 2021 Mise et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Undervalued Pseudo-nifH Sequences in Public Databases Distort Metagenomic Insights into Biological Nitrogen Fixers
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8597730/
https://www.ncbi.nlm.nih.gov/pubmed/34787447
http://dx.doi.org/10.1128/msphere.00785-21