
SCMHBP: prediction and analysis of heme binding proteins using propensity scores of dipeptides

BACKGROUND: Heme binding proteins (HBPs) are metalloproteins that contain a heme ligand (an iron-porphyrin complex) as the prosthetic group. Several computational methods have been proposed to predict heme binding residues and thereby to understand the interactions between heme and its host proteins...


Bibliographic Details
Main authors: Liou, Yi-Fan; Charoenkwan, Phasit; Srinivasulu, Yerukala Sathipati; Vasylenko, Tamara; Lai, Shih-Chung; Lee, Hua-Chin; Chen, Yi-Hsiung; Huang, Hui-Ling; Ho, Shinn-Ying
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290654/
https://www.ncbi.nlm.nih.gov/pubmed/25522279
http://dx.doi.org/10.1186/1471-2105-15-S16-S4
author Liou, Yi-Fan
Charoenkwan, Phasit
Srinivasulu, Yerukala Sathipati
Vasylenko, Tamara
Lai, Shih-Chung
Lee, Hua-Chin
Chen, Yi-Hsiung
Huang, Hui-Ling
Ho, Shinn-Ying
collection PubMed
description BACKGROUND: Heme binding proteins (HBPs) are metalloproteins that contain a heme ligand (an iron-porphyrin complex) as the prosthetic group. Several computational methods have been proposed to predict heme binding residues and thereby to understand the interactions between heme and its host proteins. However, few in silico methods for identifying HBPs have been proposed.
RESULTS: This work proposes a method based on the scoring card method (SCM), named SCMHBP, for predicting and analyzing HBPs from sequences. First, a balanced dataset of 747 HBPs (selected using the Gene Ontology term GO:0020037) and 747 non-HBPs (selected from 91,414 putative non-HBPs) with an identity of 25% was established. Next, a set of scores quantifying the propensity of amino acids and dipeptides to occur in HBPs was estimated using SCM to maximize the predictive accuracy of SCMHBP. Finally, the informative physicochemical properties of the 20 amino acids were identified from the estimated propensity scores for use in categorizing HBPs. The training accuracy and the mean test accuracy of SCMHBP on three independent test datasets are 85.90% and 71.57%, respectively. SCMHBP performs well in comparison with methods such as support vector machine (SVM), decision tree J48, and Bayes classifiers. Putative non-HBPs with high sequence propensity scores are potential HBPs, which can be validated experimentally. The propensity scores of individual amino acids and dipeptides are examined to elucidate the interactions between heme and its host proteins. The following characteristics of HBPs are derived from the propensity scores: 1) aromatic side chains are important to the effectiveness of specific HBP functions; 2) a hydrophobic environment is important in the interaction between heme and binding sites; and 3) the whole HBP has low flexibility whereas the heme binding residues are relatively flexible.
CONCLUSIONS: SCMHBP yields knowledge that improves our understanding of HBPs rather than merely improving the accuracy of HBP prediction.
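The scoring idea summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a sequence score is the composition-weighted sum of dipeptide propensity scores, that starting scores come from the difference in mean dipeptide composition between HBPs and non-HBPs rescaled to 0..1000, and that a fixed threshold separates the classes. The optimization step that the abstract says maximizes predictive accuracy is omitted, and all function names, the scaling range, the threshold, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_SET = set(DIPEPTIDES)


def dipeptide_composition(seq):
    """Relative frequency of each standard dipeptide found in a sequence."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(p for p in pairs if p in DIPEPTIDE_SET)
    total = sum(counts.values())
    return {dp: n / total for dp, n in counts.items()} if total else {}


def mean_composition(seqs):
    """Average dipeptide composition over a non-empty list of sequences."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    comps = [dipeptide_composition(s) for s in seqs]
    return {dp: sum(c.get(dp, 0.0) for c in comps) / len(comps)
            for dp in DIPEPTIDES}


def initial_propensity_scores(hbps, non_hbps):
    """Assumed starting scores: composition difference rescaled to 0..1000."""
    pos, neg = mean_composition(hbps), mean_composition(non_hbps)
    diff = {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}
    lo, hi = min(diff.values()), max(diff.values())
    if hi == lo:  # no signal at all: give every dipeptide a neutral score
        return {dp: 500.0 for dp in DIPEPTIDES}
    return {dp: 1000.0 * (d - lo) / (hi - lo) for dp, d in diff.items()}


def sequence_score(seq, scores):
    """Composition-weighted sum of dipeptide propensity scores."""
    comp = dipeptide_composition(seq)
    return sum(freq * scores[dp] for dp, freq in comp.items())


def predict_hbp(seq, scores, threshold):
    """Call a sequence an HBP when its score exceeds the threshold."""
    return sequence_score(seq, scores) > threshold


# Toy, made-up sequences used only to exercise the functions.
toy_scores = initial_propensity_scores(["HHHH", "CHCH"], ["AAAA", "GAGA"])
toy_threshold = 500.0  # hypothetical cut-off, not a value from the paper
print(predict_hbp("CHCH", toy_scores, toy_threshold))
print(predict_hbp("GAGA", toy_scores, toy_threshold))
```

Because each sequence score is a plain weighted sum, the contribution of every dipeptide can be read directly from the score table, which is the kind of interpretability the description emphasizes.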
format Online
Article
Text
id pubmed-4290654
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4290654 2015-01-15 BMC Bioinformatics Research BioMed Central 2014-12-08 /pmc/articles/PMC4290654/ /pubmed/25522279 http://dx.doi.org/10.1186/1471-2105-15-S16-S4 Text en Copyright © 2014 Liou et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SCMHBP: prediction and analysis of heme binding proteins using propensity scores of dipeptides
topic Research