iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree
Growth hormone binding protein (GHBP) is a soluble carrier protein that can selectively and non-covalently interact with growth hormone, thereby acting as a modulator or inhibitor of growth hormone signalling. Accurate identification of the GHBP from a given protein sequence also provides important clues for u...
Main Authors: | Basith, Shaherin; Manavalan, Balachandran; Shin, Tae Hwan; Lee, Gwang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6222285/ https://www.ncbi.nlm.nih.gov/pubmed/30425802 http://dx.doi.org/10.1016/j.csbj.2018.10.007 |
_version_ | 1783369170564415488 |
---|---|
author | Basith, Shaherin Manavalan, Balachandran Shin, Tae Hwan Lee, Gwang |
author_facet | Basith, Shaherin Manavalan, Balachandran Shin, Tae Hwan Lee, Gwang |
author_sort | Basith, Shaherin |
collection | PubMed |
description | Growth hormone binding protein (GHBP) is a soluble carrier protein that can selectively and non-covalently interact with growth hormone, thereby acting as a modulator or inhibitor of growth hormone signalling. Accurate identification of the GHBP from a given protein sequence also provides important clues for understanding cell growth and cellular mechanisms. In the postgenomic era, an abundance of protein sequence data has been garnered; hence, it is crucial to develop an automated computational method that enables fast and accurate identification of putative GHBPs within a vast number of candidate proteins. In this study, we describe a novel machine-learning-based predictor called iGHBP for the identification of GHBP. To predict GHBP from a given protein sequence, we trained an extremely randomised tree with an optimal feature set that was obtained from a combination of dipeptide composition and amino acid index values by applying a two-step feature selection protocol. During cross-validation analysis, iGHBP achieved an accuracy of 84.9%, which was ~7% higher than the control extremely randomised tree predictor trained with all features, thus demonstrating the effectiveness of our feature selection protocol. Furthermore, when objectively evaluated on an independent data set, our proposed iGHBP method displayed superior performance compared to the existing method. Additionally, a user-friendly web server that implements the proposed iGHBP has been established and is available at http://thegleelab.org/iGHBP. |
format | Online Article Text |
id | pubmed-6222285 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-6222285 2018-11-13 iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree Basith, Shaherin Manavalan, Balachandran Shin, Tae Hwan Lee, Gwang Comput Struct Biotechnol J Short Survey Growth hormone binding protein (GHBP) is a soluble carrier protein that can selectively and non-covalently interact with growth hormone, thereby acting as a modulator or inhibitor of growth hormone signalling. Accurate identification of the GHBP from a given protein sequence also provides important clues for understanding cell growth and cellular mechanisms. In the postgenomic era, an abundance of protein sequence data has been garnered; hence, it is crucial to develop an automated computational method that enables fast and accurate identification of putative GHBPs within a vast number of candidate proteins. In this study, we describe a novel machine-learning-based predictor called iGHBP for the identification of GHBP. To predict GHBP from a given protein sequence, we trained an extremely randomised tree with an optimal feature set that was obtained from a combination of dipeptide composition and amino acid index values by applying a two-step feature selection protocol. During cross-validation analysis, iGHBP achieved an accuracy of 84.9%, which was ~7% higher than the control extremely randomised tree predictor trained with all features, thus demonstrating the effectiveness of our feature selection protocol. Furthermore, when objectively evaluated on an independent data set, our proposed iGHBP method displayed superior performance compared to the existing method. Additionally, a user-friendly web server that implements the proposed iGHBP has been established and is available at http://thegleelab.org/iGHBP. Research Network of Computational and Structural Biotechnology 2018-10-24 /pmc/articles/PMC6222285/ /pubmed/30425802 http://dx.doi.org/10.1016/j.csbj.2018.10.007 Text en © 2018 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Short Survey Basith, Shaherin Manavalan, Balachandran Shin, Tae Hwan Lee, Gwang iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree |
title | iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree |
topic | Short Survey |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6222285/ https://www.ncbi.nlm.nih.gov/pubmed/30425802 http://dx.doi.org/10.1016/j.csbj.2018.10.007 |
work_keys_str_mv | AT basithshaherin ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree AT manavalanbalachandran ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree AT shintaehwan ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree AT leegwang ighbpcomputationalidentificationofgrowthhormonebindingproteinsfromsequencesusingextremelyrandomisedtree |
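
Note on the features named in the description: the abstract says the predictor is trained on dipeptide composition combined with amino acid index values. As an illustration only, the following is a minimal sketch of how a dipeptide composition vector is commonly computed: the frequency of each of the 400 ordered residue pairs among the overlapping pairs of a sequence. The function name `dipeptide_composition`, the residue ordering, and the handling of non-standard residues are assumptions of this sketch, not details taken from the record or from the iGHBP implementation.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes); this ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional dipeptide composition vector.

    Each entry is the count of one ordered residue pair divided by the
    number of overlapping pairs that were counted. Pairs containing a
    non-standard residue (for example 'X') are skipped, which is a
    choice made for this sketch.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short, or no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / total for dp in DIPEPTIDES]


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features))
    print(features[DIPEPTIDES.index("AC")])
```

In a pipeline of the kind the abstract outlines, vectors like this would be concatenated with amino acid index features, reduced by feature selection, and passed to an extremely randomised tree classifier (for instance scikit-learn's `ExtraTreesClassifier`, named here as one possible choice, not as the authors' tooling).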