Characterization and identification of lysine crotonylation sites based on machine learning method on both plant and mammalian
Lysine crotonylation (Kcr) is a type of protein post-translational modification (PTM), which plays important roles in a variety of cellular regulation and processes. Several methods have been proposed for the identification of crotonylation. However, most of these methods can predict efficiently onl...
Main authors: | Wang, Rulan; Wang, Zhuo; Wang, Hongfei; Pang, Yuxuan; Lee, Tzong-Yi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7686339/ https://www.ncbi.nlm.nih.gov/pubmed/33235255 http://dx.doi.org/10.1038/s41598-020-77173-0 |
_version_ | 1783613313336213504 |
---|---|
author | Wang, Rulan; Wang, Zhuo; Wang, Hongfei; Pang, Yuxuan; Lee, Tzong-Yi |
author_facet | Wang, Rulan; Wang, Zhuo; Wang, Hongfei; Pang, Yuxuan; Lee, Tzong-Yi |
author_sort | Wang, Rulan |
collection | PubMed |
description | Lysine crotonylation (Kcr) is a type of protein post-translational modification (PTM) that plays important roles in a variety of cellular regulatory processes. Several methods have been proposed for identifying crotonylation sites. However, most of these methods predict well on either histone or non-histone proteins, but not both. This work therefore aims for more balanced performance across species, covering plant (non-histone) and mammalian (histone) proteins. Support vector machine (SVM) and random forest (RF) classifiers were employed in this study. According to the cross-validation results, the RF classifier based on the EGAAC feature achieved the best predictive performance, competitive with existing methods and more robust on imbalanced datasets. An independent test was also carried out to compare this study with existing methods based on the same features or the same classifier. The SVM and RF classifiers achieved best performances of 92% sensitivity, 88% specificity, 90% accuracy, and an MCC of 0.80 on the relatively small mammalian dataset, and 77% sensitivity, 83% specificity, 70% accuracy, and an MCC of 0.54 on the large-scale plant dataset. A cross-species independent test was also carried out, which confirmed the species diversity between plant and mammalian data. |
format | Online Article Text |
id | pubmed-7686339 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7686339 2020-11-27 Sci Rep Article Nature Publishing Group UK 2020-11-24 /pmc/articles/PMC7686339/ /pubmed/33235255 http://dx.doi.org/10.1038/s41598-020-77173-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Characterization and identification of lysine crotonylation sites based on machine learning method on both plant and mammalian |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7686339/ https://www.ncbi.nlm.nih.gov/pubmed/33235255 http://dx.doi.org/10.1038/s41598-020-77173-0 |
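
The description above reports sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC). As a reading aid, the following minimal sketch shows how these four measures are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the article; the counts were chosen only so that a balanced test set of 200 sites yields rates matching the reported 92% / 88% / 90% / 0.80.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification measures from confusion-matrix counts.

    tp/fn: modified (positive) sites predicted correctly/incorrectly.
    tn/fp: unmodified (negative) sites predicted correctly/incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    sensitivity = tp / positives if positives else 0.0
    specificity = tn / negatives if negatives else 0.0
    accuracy = (tp + tn) / (positives + negatives)
    # MCC denominator is zero when any row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts (100 positive and 100 negative sites), not the paper's data.
example = binary_metrics(tp=92, fn=8, tn=88, fp=12)
print({name: round(value, 2) for name, value in example.items()})
```

Because accuracy is a weighted average of sensitivity and specificity, it depends on the class balance of the test set, whereas MCC uses all four cells of the confusion matrix and is often preferred for imbalanced data.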