Position-specific prediction of methylation sites from sequence conservation based on information theory
Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation site...
Main Authors: | Shi, Yinan; Guo, Yanzhi; Hu, Yayun; Li, Menglong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5378888/ https://www.ncbi.nlm.nih.gov/pubmed/26202727 http://dx.doi.org/10.1038/srep12403 |
_version_ | 1782519500299567104 |
---|---|
author | Shi, Yinan Guo, Yanzhi Hu, Yayun Li, Menglong |
author_facet | Shi, Yinan Guo, Yanzhi Hu, Yayun Li, Menglong |
author_sort | Shi, Yinan |
collection | PubMed |
description | Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation. |
format | Online Article Text |
id | pubmed-5378888 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-53788882017-04-07 Position-specific prediction of methylation sites from sequence conservation based on information theory Shi, Yinan Guo, Yanzhi Hu, Yayun Li, Menglong Sci Rep Article Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation. Nature Publishing Group 2015-07-23 /pmc/articles/PMC5378888/ /pubmed/26202727 http://dx.doi.org/10.1038/srep12403 Text en Copyright © 2015, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Shi, Yinan Guo, Yanzhi Hu, Yayun Li, Menglong Position-specific prediction of methylation sites from sequence conservation based on information theory |
title | Position-specific prediction of methylation sites from sequence conservation based on information theory |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5378888/ https://www.ncbi.nlm.nih.gov/pubmed/26202727 http://dx.doi.org/10.1038/srep12403 |
work_keys_str_mv | AT shiyinan positionspecificpredictionofmethylationsitesfromsequenceconservationbasedoninformationtheory AT guoyanzhi positionspecificpredictionofmethylationsitesfromsequenceconservationbasedoninformationtheory AT huyayun positionspecificpredictionofmethylationsitesfromsequenceconservationbasedoninformationtheory AT limenglong positionspecificpredictionofmethylationsitesfromsequenceconservationbasedoninformationtheory |
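
The description in this record says that conservation difference profiles between methylated and non-methylated peptides were built from information entropy (IE) over a window of neighboring residues. As a rough illustration of that idea only, the sketch below computes a per-position Shannon entropy for two sets of equal-length peptide windows and subtracts one profile from the other. The function names, the toy peptides, and the sign convention (non-methylated minus methylated) are assumptions made for this example; the record does not give the authors' exact formula, window size, or data.

```python
import math
from collections import Counter


def column_entropy(peptides, position):
    """Shannon entropy in bits of the residues seen at one window position."""
    counts = Counter(p[position] for p in peptides)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(peptides):
    """Entropy at every position of a set of equal-length peptide windows."""
    if not peptides:
        raise ValueError("at least one peptide is required")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same window length")
    return [column_entropy(peptides, i) for i in range(length)]


def conservation_difference(methylated, non_methylated):
    """Per-position entropy difference (assumed convention: negative minus positive).

    A larger value means the position is more conserved (lower entropy)
    in the methylated set than in the non-methylated set.
    """
    pos = entropy_profile(methylated)
    neg = entropy_profile(non_methylated)
    if len(pos) != len(neg):
        raise ValueError("both sets must use the same window length")
    return [n - p for p, n in zip(pos, neg)]


if __name__ == "__main__":
    # Toy windows centered on an arginine; these are NOT data from the paper.
    methylated = ["GRG", "GRG", "ARG", "GRG"]
    non_methylated = ["ARK", "LRD", "GRE", "SRT"]
    for offset, diff in zip((-1, 0, 1), conservation_difference(methylated, non_methylated)):
        print(f"offset {offset:+d}: entropy difference = {diff:.3f} bits")
```

In a pipeline like the one the description outlines, positions with large differences would be candidates for the information gain ranking and the SVM features. That step is not shown here because the record does not describe it in enough detail.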