Prediction of liquid–liquid phase separating proteins using machine learning
BACKGROUND: The liquid–liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Al...
Main Authors: | Chu, Xiaoquan; Sun, Tanlin; Li, Qian; Xu, Youjun; Zhang, Zhuqing; Lai, Luhua; Pei, Jianfeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8845408/ https://www.ncbi.nlm.nih.gov/pubmed/35168563 http://dx.doi.org/10.1186/s12859-022-04599-w |
_version_ | 1784651670921150464 |
---|---|
author | Chu, Xiaoquan; Sun, Tanlin; Li, Qian; Xu, Youjun; Zhang, Zhuqing; Lai, Luhua; Pei, Jianfeng |
author_facet | Chu, Xiaoquan; Sun, Tanlin; Li, Qian; Xu, Youjun; Zhang, Zhuqing; Lai, Luhua; Pei, Jianfeng |
author_sort | Chu, Xiaoquan |
collection | PubMed |
description | BACKGROUND: The liquid–liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. Development of computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS. RESULTS: Based on the PSPs collected in LLPSDB, we developed a sequence-based prediction tool for LLPS proteins (PSPredictor), an attempt at general-purpose PSP prediction that does not depend on specific protein types. Our method combines the componential and sequential information during the protein embedding stage and adopts a machine learning algorithm for the final prediction. The proposed method achieves a tenfold cross-validation accuracy of 94.71% and outperforms previously reported PSP prediction tools. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which is accessible for prediction of potential PSPs. CONCLUSIONS: PSPredictor could identify novel scaffold proteins for stress granules and predict PSP candidates in the human genome for further study. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which provides valuable information for the recognition of potential PSPs. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04599-w. |
format | Online Article Text |
id | pubmed-8845408 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8845408 2022-02-16 Prediction of liquid–liquid phase separating proteins using machine learning Chu, Xiaoquan Sun, Tanlin Li, Qian Xu, Youjun Zhang, Zhuqing Lai, Luhua Pei, Jianfeng BMC Bioinformatics Software BACKGROUND: The liquid–liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. Development of computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS. RESULTS: Based on the PSPs collected in LLPSDB, we developed a sequence-based prediction tool for LLPS proteins (PSPredictor), an attempt at general-purpose PSP prediction that does not depend on specific protein types. Our method combines the componential and sequential information during the protein embedding stage and adopts a machine learning algorithm for the final prediction. The proposed method achieves a tenfold cross-validation accuracy of 94.71% and outperforms previously reported PSP prediction tools. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which is accessible for prediction of potential PSPs. CONCLUSIONS: PSPredictor could identify novel scaffold proteins for stress granules and predict PSP candidates in the human genome for further study. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which provides valuable information for the recognition of potential PSPs. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04599-w. BioMed Central 2022-02-15 /pmc/articles/PMC8845408/ /pubmed/35168563 http://dx.doi.org/10.1186/s12859-022-04599-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Chu, Xiaoquan Sun, Tanlin Li, Qian Xu, Youjun Zhang, Zhuqing Lai, Luhua Pei, Jianfeng Prediction of liquid–liquid phase separating proteins using machine learning |
title | Prediction of liquid–liquid phase separating proteins using machine learning |
title_full | Prediction of liquid–liquid phase separating proteins using machine learning |
title_fullStr | Prediction of liquid–liquid phase separating proteins using machine learning |
title_full_unstemmed | Prediction of liquid–liquid phase separating proteins using machine learning |
title_short | Prediction of liquid–liquid phase separating proteins using machine learning |
title_sort | prediction of liquid–liquid phase separating proteins using machine learning |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8845408/ https://www.ncbi.nlm.nih.gov/pubmed/35168563 http://dx.doi.org/10.1186/s12859-022-04599-w |
work_keys_str_mv | AT chuxiaoquan predictionofliquidliquidphaseseparatingproteinsusingmachinelearning AT suntanlin predictionofliquidliquidphaseseparatingproteinsusingmachinelearning AT liqian predictionofliquidliquidphaseseparatingproteinsusingmachinelearning AT xuyoujun predictionofliquidliquidphaseseparatingproteinsusingmachinelearning AT zhangzhuqing predictionofliquidliquidphaseseparatingproteinsusingmachinelearning AT lailuhua predictionofliquidliquidphaseseparatingproteinsusingmachinelearning AT peijianfeng predictionofliquidliquidphaseseparatingproteinsusingmachinelearning |