iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree

A soluble carrier growth hormone binding protein (GHBP) can selectively and non-covalently interact with growth hormone, thereby acting as a modulator or inhibitor of growth hormone signalling. Accurate identification of the GHBP from a given protein sequence also provides important clues for u...

Bibliographic Details
Main authors: Basith, Shaherin; Manavalan, Balachandran; Shin, Tae Hwan; Lee, Gwang
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6222285/
https://www.ncbi.nlm.nih.gov/pubmed/30425802
http://dx.doi.org/10.1016/j.csbj.2018.10.007
_version_ 1783369170564415488
author Basith, Shaherin
Manavalan, Balachandran
Shin, Tae Hwan
Lee, Gwang
author_facet Basith, Shaherin
Manavalan, Balachandran
Shin, Tae Hwan
Lee, Gwang
author_sort Basith, Shaherin
collection PubMed
description A soluble carrier growth hormone binding protein (GHBP) can selectively and non-covalently interact with growth hormone, thereby acting as a modulator or inhibitor of growth hormone signalling. Accurate identification of the GHBP from a given protein sequence also provides important clues for understanding cell growth and cellular mechanisms. In the postgenomic era, an abundance of protein sequence data has been garnered; hence, it is crucial to develop an automated computational method that enables fast and accurate identification of putative GHBPs within a vast number of candidate proteins. In this study, we describe a novel machine-learning-based predictor called iGHBP for the identification of GHBP. To predict GHBP from a given protein sequence, we trained an extremely randomised tree with an optimal feature set that was obtained from a combination of dipeptide composition and amino acid index values by applying a two-step feature selection protocol. During cross-validation analysis, iGHBP achieved an accuracy of 84.9%, which was ~7% higher than the control extremely randomised tree predictor trained with all features, thus demonstrating the effectiveness of our feature selection protocol. Furthermore, when objectively evaluated on an independent data set, our proposed iGHBP method displayed superior performance compared to the existing method. Additionally, a user-friendly web server that implements the proposed iGHBP has been established and is available at http://thegleelab.org/iGHBP.
format Online
Article
Text
id pubmed-6222285
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-62222852018-11-13 iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree Basith, Shaherin Manavalan, Balachandran Shin, Tae Hwan Lee, Gwang Comput Struct Biotechnol J Short Survey A soluble carrier growth hormone binding protein (GHBP) can selectively and non-covalently interact with growth hormone, thereby acting as a modulator or inhibitor of growth hormone signalling. Accurate identification of the GHBP from a given protein sequence also provides important clues for understanding cell growth and cellular mechanisms. In the postgenomic era, an abundance of protein sequence data has been garnered; hence, it is crucial to develop an automated computational method that enables fast and accurate identification of putative GHBPs within a vast number of candidate proteins. In this study, we describe a novel machine-learning-based predictor called iGHBP for the identification of GHBP. To predict GHBP from a given protein sequence, we trained an extremely randomised tree with an optimal feature set that was obtained from a combination of dipeptide composition and amino acid index values by applying a two-step feature selection protocol. During cross-validation analysis, iGHBP achieved an accuracy of 84.9%, which was ~7% higher than the control extremely randomised tree predictor trained with all features, thus demonstrating the effectiveness of our feature selection protocol. Furthermore, when objectively evaluated on an independent data set, our proposed iGHBP method displayed superior performance compared to the existing method. Additionally, a user-friendly web server that implements the proposed iGHBP has been established and is available at http://thegleelab.org/iGHBP.
Research Network of Computational and Structural Biotechnology 2018-10-24 /pmc/articles/PMC6222285/ /pubmed/30425802 http://dx.doi.org/10.1016/j.csbj.2018.10.007 Text en © 2018 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Short Survey
Basith, Shaherin
Manavalan, Balachandran
Shin, Tae Hwan
Lee, Gwang
iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
title iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
title_full iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
title_fullStr iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
title_full_unstemmed iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
title_short iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
title_sort ighbp: computational identification of growth hormone binding proteins from sequences using extremely randomised tree
topic Short Survey
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6222285/
https://www.ncbi.nlm.nih.gov/pubmed/30425802
http://dx.doi.org/10.1016/j.csbj.2018.10.007
work_keys_str_mv AT basithshaherin ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree
AT manavalanbalachandran ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree
AT shintaehwan ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree
AT leegwang ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree
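
The abstract names dipeptide composition as one of the two feature types fed to the extremely randomised tree. As a rough illustration only, the sketch below computes a 400-dimensional dipeptide composition vector for a protein sequence. It assumes the common definition (the count of each ordered pair of the 20 standard amino acids, divided by the number of valid adjacent pairs); this record does not state the exact formula the authors used, and the names `AMINO_ACIDS`, `DIPEPTIDES`, and `dipeptide_composition` are hypothetical.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered pairs, in a fixed order so every vector lines up.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide composition of a sequence.

    Each entry is the fraction of adjacent residue pairs equal to the
    corresponding dipeptide. Pairs containing a non-standard residue
    are skipped (an assumption made for this sketch).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or without any valid pair.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features))
    print(features[DIPEPTIDES.index("AC")])
```

Vectors of this kind, combined with amino acid index features and reduced by feature selection, could then be passed to an extremely randomised tree implementation such as scikit-learn's `ExtraTreesClassifier`; whether the authors used that library is not stated in this record.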