
Propensity Scores for Prediction and Characterization of Bioluminescent Proteins from Sequences

Bioluminescent proteins (BLPs) are a class of proteins with various mechanisms of light emission such as bioluminescence and fluorescence from luminous organisms. While valuable for commercial and medical applications, identification of BLPs, including luciferases and fluorescent proteins (FPs), is...


Bibliographic Details
Main Author: Huang, Hui-Ling
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020813/
https://www.ncbi.nlm.nih.gov/pubmed/24828431
http://dx.doi.org/10.1371/journal.pone.0097158
_version_ 1782316129435254784
author Huang, Hui-Ling
collection PubMed
description Bioluminescent proteins (BLPs) are a class of proteins from luminous organisms that emit light through various mechanisms, such as bioluminescence and fluorescence. Although BLPs are valuable for commercial and medical applications, identifying them, including luciferases and fluorescent proteins (FPs), is challenging because their protein sequences are highly diverse. Moreover, characterizing BLPs facilitates mutagenesis analysis aimed at enhancing bioluminescence and fluorescence. This study therefore proposes a novel approach to estimating the propensity scores of 400 dipeptides and 20 amino acids, based on a scoring card method (SCM), in order to design two prediction methods and characterize BLPs. The SCMBLP method for predicting BLPs achieves a 10-fold cross-validation accuracy of 90.83%, higher than that of existing support vector machine based methods, and a test accuracy of 82.85%. A dataset of 269 luciferases and 216 FPs is also established to design the SCMLFP prediction method, which achieves training and test accuracies of 97.10% and 96.28%, respectively. Additionally, four informative physicochemical properties of the 20 amino acids are identified from the estimated propensity scores to characterize BLPs: 1) high transfer free energy from inside to the protein surface, 2) high occurrence frequency of residues in the transmembrane regions of the protein, 3) large hydrophobicity scale from the native protein structure, and 4) high correlation coefficient (R = 0.921) between the amino acid compositions of BLPs and integral membrane proteins. Further analysis of BLPs reveals that luciferases have a larger value of R (0.937) than FPs (0.635), suggesting that luciferases, more than FPs, tend to be located near the cell membrane for convenient receipt of extracellular ions. Importantly, the propensity scores of dipeptides and amino acids and the identified properties facilitate efforts to predict, characterize, and apply BLPs, including luciferases, photoproteins, and FPs. The web server is available at http://iclab.life.nctu.edu.tw/SCMBLP/index.html.
format Online
Article
Text
id pubmed-4020813
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4020813 2014-05-21 Propensity Scores for Prediction and Characterization of Bioluminescent Proteins from Sequences Huang, Hui-Ling PLoS One Research Article Public Library of Science 2014-05-14 /pmc/articles/PMC4020813/ /pubmed/24828431 http://dx.doi.org/10.1371/journal.pone.0097158 Text en © 2014 Hui-Ling Huang http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Propensity Scores for Prediction and Characterization of Bioluminescent Proteins from Sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020813/
https://www.ncbi.nlm.nih.gov/pubmed/24828431
http://dx.doi.org/10.1371/journal.pone.0097158
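The abstract in this record describes scoring sequences with propensity scores for 400 dipeptides and 20 amino acids, but the record itself gives no formulas. The following is only a minimal sketch of how a scoring-card style predictor could work, under several assumptions that do not come from the record: initial scores are taken as the rescaled difference in mean dipeptide composition between a positive and a negative set, a sequence score is the composition-weighted sum of dipeptide scores, and an amino acid score is the mean over the dipeptides that contain it. All function names are hypothetical, and the score optimization step used by the published method is not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Relative frequency of each of the 400 dipeptides in seq."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # ignore pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return dict.fromkeys(DIPEPTIDES, 0.0)
    return {dp: c / total for dp, c in counts.items()}


def mean_composition(seqs):
    """Average dipeptide composition over a list of sequences."""
    comps = [dipeptide_composition(s) for s in seqs]
    return {dp: sum(c[dp] for c in comps) / len(comps) for dp in DIPEPTIDES}


def initial_propensity_scores(positives, negatives):
    """Assumed initial scores: class difference rescaled to 0..1000."""
    pos = mean_composition(positives)
    neg = mean_composition(negatives)
    diff = {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}
    lo, hi = min(diff.values()), max(diff.values())
    span = (hi - lo) or 1.0  # avoid division by zero if all equal
    return {dp: 1000.0 * (d - lo) / span for dp, d in diff.items()}


def sequence_score(seq, scores):
    """Composition-weighted sum of dipeptide scores for one sequence."""
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * scores[dp] for dp in DIPEPTIDES)


def amino_acid_scores(scores):
    """Mean score of the dipeptides that contain each amino acid."""
    result = {}
    for aa in AMINO_ACIDS:
        vals = [s for dp, s in scores.items() if aa in dp]
        result[aa] = sum(vals) / len(vals)
    return result


def predict(seq, scores, threshold):
    """Label a sequence positive when its score exceeds the threshold."""
    return sequence_score(seq, scores) > threshold
```

In such a scheme the threshold would be chosen on training data, for example as the value that maximizes training accuracy; the toy sequences one might pass in here are placeholders, not data from the study.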