nifPred: Proteome-Wide Identification and Categorization of Nitrogen-Fixation Proteins of Diaztrophs Based on Composition-Transition-Distribution Features Using Support Vector Machine

As inorganic nitrogen compounds are essential for the basic building blocks of life (e.g., nucleotides and amino acids), the role of biological nitrogen fixation (BNF) is indispensable. All nitrogen-fixing microbes rely on the same nitrogenase enzyme for nitrogen reduction, which is in fact an enzyme co...

Bibliographic Details
Main Authors: Meher, Prabina K., Sahu, Tanmaya K., Mohanty, Jyotilipsa, Gahoi, Shachi, Purru, Supriya, Grover, Monendra, Rao, Atmakuri R.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Microbiology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5986947/
https://www.ncbi.nlm.nih.gov/pubmed/29896173
http://dx.doi.org/10.3389/fmicb.2018.01100
author Meher, Prabina K.
Sahu, Tanmaya K.
Mohanty, Jyotilipsa
Gahoi, Shachi
Purru, Supriya
Grover, Monendra
Rao, Atmakuri R.
collection PubMed
description Because inorganic nitrogen compounds are essential for the basic building blocks of life (e.g., nucleotides and amino acids), biological nitrogen fixation (BNF) is indispensable. All nitrogen-fixing microbes rely on the same nitrogenase enzyme for nitrogen reduction, which is in fact an enzyme complex encoded by as many as 20 genes. However, six genes, viz. nifB, nifD, nifE, nifH, nifK, and nifN, have been proposed to be essential for a functional nitrogenase enzyme. Identifying these genes is therefore important for understanding the mechanism of BNF and for exploring ways to improve BNF from an agricultural sustainability point of view. Although computational tools are available for the annotation and phylogenetic analysis of nifH gene sequences alone, to the best of our knowledge no tool is available for the computational prediction of all six of these categories of nitrogen-fixation (nif) genes or proteins. We therefore propose an approach, the first of its kind, for the computational identification of nif proteins encoded by the six categories of nif genes. Sequence-derived features were used to map the input sequences into numeric vectors, which were then supplied to a support vector machine. Two types of classifier were constructed: (i) a binary classifier that distinguishes nif from non-nitrogen-fixation (non-nif) proteins, and (ii) a multi-class classifier that assigns nif proteins to one of the six categories. The combination of the composition-transition-distribution (CTD) feature set and the radial kernel yielded higher accuracies than the other feature-kernel combinations. Overall accuracies exceeded 90% in both binary and multi-class classification, and the approach further achieved >92% accuracy when evaluated on blind (independent) test datasets. It also identified nif proteins with high accuracy when evaluated on proteome-wide datasets of several species. Furthermore, we established a prediction server, nifPred (http://webapp.cabgrid.res.in/nifPred), to assist the scientific community in the proteome-wide identification of the six categories of nif proteins. The source code of nifPred is also available at https://github.com/PrabinaMeher/nifPred. The web server is expected to supplement transcriptional profiling and comparative genomics studies for the identification and functional annotation of genes related to BNF.
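The feature-mapping step described in the abstract can be illustrated with a minimal sketch. It assumes a single physicochemical property (a commonly used three-class hydrophobicity grouping) and one rounding convention for the distribution descriptors; neither detail is stated in this record, so the grouping, the function names `ctd_features` and `rbf_kernel`, and the `gamma` value are illustrative assumptions rather than the nifPred implementation. The sketch turns a protein sequence into a 21-value CTD vector and evaluates a radial (RBF) kernel between two such vectors, which is the similarity measure a radial-kernel support vector machine would work with.

```python
import math

# Assumed three-class hydrophobicity grouping (a common CTD choice,
# not taken from this record).
GROUPS = {
    1: set("RKEDQN"),   # polar
    2: set("GASTPHY"),  # neutral
    3: set("CLVIMFW"),  # hydrophobic
}
LOOKUP = {aa: g for g, members in GROUPS.items() for aa in members}


def ctd_features(seq):
    """Return 3 composition + 3 transition + 15 distribution values."""
    # Residues outside the 20 standard amino acids are skipped.
    labels = [LOOKUP[aa] for aa in seq.upper() if aa in LOOKUP]
    n = len(labels)
    if n < 2:
        raise ValueError("sequence too short for CTD features")

    # Composition: fraction of residues falling in each group.
    comp = [labels.count(g) / n for g in (1, 2, 3)]

    # Transition: fraction of adjacent pairs that switch between two groups.
    pairs = list(zip(labels, labels[1:]))
    trans = []
    for a, b in ((1, 2), (1, 3), (2, 3)):
        switches = sum(1 for p in pairs if p in ((a, b), (b, a)))
        trans.append(switches / (n - 1))

    # Distribution: relative position of the first, 25%, 50%, 75% and last
    # residue of each group (floor rounding is an assumed convention).
    dist = []
    for g in (1, 2, 3):
        positions = [i + 1 for i, lab in enumerate(labels) if lab == g]
        if not positions:
            dist.extend([0.0] * 5)
            continue
        m = len(positions)
        ranks = [1] + [max(1, math.floor(f * m)) for f in (0.25, 0.5, 0.75, 1.0)]
        dist.extend(positions[r - 1] / n for r in ranks)

    return comp + trans + dist


def rbf_kernel(x, y, gamma=1.0):
    """Radial kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    # Two made-up peptide fragments, used only to exercise the functions.
    v1 = ctd_features("MRKGACLLVTPE")
    v2 = ctd_features("GASTPHYCLVIM")
    print(len(v1), rbf_kernel(v1, v2))
```

In a full pipeline, vectors like these would be computed for every protein in the training set and passed to an SVM library, once with nif versus non-nif labels for the binary classifier and once with the six nif categories for the multi-class classifier.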
format Online
Article
Text
id pubmed-5986947
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5986947 2018-06-12 Front Microbiol Microbiology Frontiers Media S.A. 2018-05-29 /pmc/articles/PMC5986947/ /pubmed/29896173 http://dx.doi.org/10.3389/fmicb.2018.01100 Text en Copyright © 2018 Meher, Sahu, Mohanty, Gahoi, Purru, Grover and Rao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title nifPred: Proteome-Wide Identification and Categorization of Nitrogen-Fixation Proteins of Diaztrophs Based on Composition-Transition-Distribution Features Using Support Vector Machine
topic Microbiology