NIFtHool: an informatics program for identification of NifH proteins using deep neural networks

Atmospheric nitrogen fixation carried out by microorganisms has environmental and industrial importance, related to the increase of soil fertility and productivity. The present work proposes the development of a new high precision system that allows the recognition of amino acid sequences of the nit...

Bibliographic Details
Main authors: Suquilanda-Pesántez, Jefferson Daniel, Aguiar Salazar, Evelyn Dayana, Almeida-Galárraga, Diego, Salum, Graciela, Villalba-Meneses, Fernando, Gudiño Gomezjurado, Marco Esteban
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8956849/
https://www.ncbi.nlm.nih.gov/pubmed/35360826
http://dx.doi.org/10.12688/f1000research.107925.1
_version_ 1784676643495739392
author Suquilanda-Pesántez, Jefferson Daniel
Aguiar Salazar, Evelyn Dayana
Almeida-Galárraga, Diego
Salum, Graciela
Villalba-Meneses, Fernando
Gudiño Gomezjurado, Marco Esteban
author_facet Suquilanda-Pesántez, Jefferson Daniel
Aguiar Salazar, Evelyn Dayana
Almeida-Galárraga, Diego
Salum, Graciela
Villalba-Meneses, Fernando
Gudiño Gomezjurado, Marco Esteban
author_sort Suquilanda-Pesántez, Jefferson Daniel
collection PubMed
description Atmospheric nitrogen fixation by microorganisms is environmentally and industrially important because it increases soil fertility and productivity. This work proposes a new high-precision system that recognizes amino acid sequences of the nitrogenase enzyme (NifH) as a promising way to improve the identification of diazotrophic bacteria. For this purpose, a processed dataset of 4911 NifH and 4782 non-NifH amino acid sequences was built from a database obtained from UniProt. Feature extraction then used two methodologies: (i) k-mer counting and (ii) embedding layers, both yielding numerical vectors of the amino acid chains. For the embedding approach, the data were passed through an external trainable convolutional layer, which received a uniform matrix and applied convolution filters to obtain the feature maps of the model. Finally, a deep neural network served as the primary model to classify each amino acid sequence as a NifH protein or not. Performance evaluation experiments yielded an accuracy of 96.4%, a sensitivity of 95.2%, and a specificity of 96.7%. In summary, an amino acid sequence-based feature extraction method that uses a neural network to detect N-fixing organisms is proposed and implemented. NIFtHool is available from: https://nifthool.anvil.app/
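The k-mer counting step named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of k = 2, the fixed 20-letter alphabet, the function names kmer_counts and kmer_vector, and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from typing import List

# Assumption: only the 20 standard amino acids form the k-mer vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_counts(sequence: str, k: int = 2) -> Counter:
    """Count overlapping k-mers in a single amino acid sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence: str, k: int = 2) -> List[int]:
    """Map a sequence to a fixed-length count vector (20**k entries).

    K-mers containing non-standard residues are not in the vocabulary
    and are therefore ignored.
    """
    counts = kmer_counts(sequence, k)
    vocabulary = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Hypothetical toy sequence, not taken from the NIFtHool dataset.
toy = "MAMRQCAIYGKGG"
vec = kmer_vector(toy, k=2)
```

Every sequence maps to a vector of the same length regardless of how long the protein is, which is what lets such counts feed a fixed-size input layer of a neural network.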
format Online
Article
Text
id pubmed-8956849
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-89568492022-03-30 NIFtHool: an informatics program for identification of NifH proteins using deep neural networks Suquilanda-Pesántez, Jefferson Daniel Aguiar Salazar, Evelyn Dayana Almeida-Galárraga, Diego Salum, Graciela Villalba-Meneses, Fernando Gudiño Gomezjurado, Marco Esteban F1000Res Software Tool Article Atmospheric nitrogen fixation carried out by microorganisms has environmental and industrial importance, related to the increase of soil fertility and productivity. The present work proposes the development of a new high precision system that allows the recognition of amino acid sequences of the nitrogenase enzyme (NifH) as a promising way to improve the identification of diazotrophic bacteria. For this purpose, a database obtained from UniProt built a processed dataset formed by a set of 4911 and 4782 amino acid sequences of the NifH and non-NifH proteins respectively. Subsequently, the feature extraction was developed using two methodologies: (i) k-mers counting and (ii) embedding layers to obtain numerical vectors of the amino acid chains. Afterward, for the embedding layer, the data was crossed by an external trainable convolutional layer, which received a uniform matrix and applied convolution using filters to obtain the feature maps of the model. Finally, a deep neural network was used as the primary model to classify the amino acid sequences as NifH protein or not. Performance evaluation experiments were carried out, and the results revealed an accuracy of 96.4%, a sensitivity of 95.2%, and a specificity of 96.7%. Therefore, an amino acid sequence-based feature extraction method that uses a neural network to detect N-fixing organisms is proposed and implemented. NIFtHool is available from: https://nifthool.anvil.app/ F1000 Research Limited 2022-02-09 /pmc/articles/PMC8956849/ /pubmed/35360826 http://dx.doi.org/10.12688/f1000research.107925.1 Text en Copyright: © 2022 Suquilanda-Pesántez JD et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Tool Article
Suquilanda-Pesántez, Jefferson Daniel
Aguiar Salazar, Evelyn Dayana
Almeida-Galárraga, Diego
Salum, Graciela
Villalba-Meneses, Fernando
Gudiño Gomezjurado, Marco Esteban
NIFtHool: an informatics program for identification of NifH proteins using deep neural networks
title NIFtHool: an informatics program for identification of NifH proteins using deep neural networks
topic Software Tool Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8956849/
https://www.ncbi.nlm.nih.gov/pubmed/35360826
http://dx.doi.org/10.12688/f1000research.107925.1