Cargando…

NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes

Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, su...

Descripción completa

Detalles Bibliográficos
Autores principales: Liu, Di, Lin, Zhengkui, Jia, Cangzhi
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414792/
https://www.ncbi.nlm.nih.gov/pubmed/37576553
http://dx.doi.org/10.3389/fgene.2023.1226905
_version_ 1785087410251497472
author Liu, Di
Lin, Zhengkui
Jia, Cangzhi
author_facet Liu, Di
Lin, Zhengkui
Jia, Cangzhi
author_sort Liu, Di
collection PubMed
description Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides.
format Online
Article
Text
id pubmed-10414792
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-104147922023-08-11 NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes Liu, Di Lin, Zhengkui Jia, Cangzhi Front Genet Genetics Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides. Frontiers Media S.A. 2023-07-27 /pmc/articles/PMC10414792/ /pubmed/37576553 http://dx.doi.org/10.3389/fgene.2023.1226905 Text en Copyright © 2023 Liu, Lin and Jia. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Liu, Di
Lin, Zhengkui
Jia, Cangzhi
NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
title NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
title_full NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
title_fullStr NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
title_full_unstemmed NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
title_short NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
title_sort neurocnn_gnb: an ensemble model to predict neuropeptides based on a convolution neural network and gaussian naive bayes
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414792/
https://www.ncbi.nlm.nih.gov/pubmed/37576553
http://dx.doi.org/10.3389/fgene.2023.1226905
work_keys_str_mv AT liudi neurocnngnbanensemblemodeltopredictneuropeptidesbasedonaconvolutionneuralnetworkandgaussiannaivebayes
AT linzhengkui neurocnngnbanensemblemodeltopredictneuropeptidesbasedonaconvolutionneuralnetworkandgaussiannaivebayes
AT jiacangzhi neurocnngnbanensemblemodeltopredictneuropeptidesbasedonaconvolutionneuralnetworkandgaussiannaivebayes