
Position-specific prediction of methylation sites from sequence conservation based on information theory

Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation site...


Bibliographic Details
Main authors: Shi, Yinan; Guo, Yanzhi; Hu, Yayun; Li, Menglong
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5378888/
https://www.ncbi.nlm.nih.gov/pubmed/26202727
http://dx.doi.org/10.1038/srep12403
author Shi, Yinan
Guo, Yanzhi
Hu, Yayun
Li, Menglong
collection PubMed
description Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation.
format Online
Article
Text
id pubmed-5378888
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5378888 2017-04-07 Sci Rep Article Nature Publishing Group 2015-07-23 /pmc/articles/PMC5378888/ /pubmed/26202727 http://dx.doi.org/10.1038/srep12403 Text en Copyright © 2015, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Position-specific prediction of methylation sites from sequence conservation based on information theory
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5378888/
https://www.ncbi.nlm.nih.gov/pubmed/26202727
http://dx.doi.org/10.1038/srep12403
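
The description in this record mentions conservation profiles built from information entropy (IE) and neighbor residues ranked by information gain (IG). The following is a minimal sketch of those two quantities, assuming equal-length peptide windows centred on the candidate residue. It is not the authors' implementation: the function names, the window length, and the toy peptides are all hypothetical and chosen only for illustration, and the SVM step is not shown.

```python
import math
from collections import Counter


def column_entropy(peptides, pos):
    """Shannon entropy (bits) of the residues observed at one window position."""
    counts = Counter(p[pos] for p in peptides)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_profile(peptides):
    """Entropy at every position of a set of equal-length peptide windows."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptide windows must have the same length")
    return [column_entropy(peptides, i) for i in range(length)]


def information_gain(positives, negatives, pos):
    """Reduction in class-label entropy after observing the residue at `pos`."""
    labelled = [(p[pos], 1) for p in positives] + [(n[pos], 0) for n in negatives]
    total = len(labelled)

    def label_entropy(items):
        counts = Counter(label for _, label in items)
        size = len(items)
        return -sum((c / size) * math.log2(c / size) for c in counts.values())

    # Group the labelled observations by the residue seen at this position.
    by_residue = {}
    for residue, label in labelled:
        by_residue.setdefault(residue, []).append((residue, label))
    conditional = sum(len(g) / total * label_entropy(g) for g in by_residue.values())
    return label_entropy(labelled) - conditional


# Toy windows centred on a lysine (K); invented for illustration only.
methylated = ["GAKRS", "GSKRT", "GAKRA"]
non_methylated = ["LEKDV", "PDKEA", "TQKNL"]

pos_profile = entropy_profile(methylated)
neg_profile = entropy_profile(non_methylated)
# Positive values mark positions that are more conserved in the methylated set.
difference = [n - p for p, n in zip(pos_profile, neg_profile)]
gains = [information_gain(methylated, non_methylated, i) for i in range(5)]
```

In this toy data the central position carries no information gain because every window has K there, while flanking positions that differ between the two sets score higher. Under the assumptions above, positions with the highest gain would be the candidates to keep as classifier features.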