Prediction of liquid–liquid phase separating proteins using machine learning

BACKGROUND: The liquid–liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both, and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Al...

Bibliographic Details
Main authors: Chu, Xiaoquan, Sun, Tanlin, Li, Qian, Xu, Youjun, Zhang, Zhuqing, Lai, Luhua, Pei, Jianfeng
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8845408/
https://www.ncbi.nlm.nih.gov/pubmed/35168563
http://dx.doi.org/10.1186/s12859-022-04599-w
_version_ 1784651670921150464
author Chu, Xiaoquan
Sun, Tanlin
Li, Qian
Xu, Youjun
Zhang, Zhuqing
Lai, Luhua
Pei, Jianfeng
author_facet Chu, Xiaoquan
Sun, Tanlin
Li, Qian
Xu, Youjun
Zhang, Zhuqing
Lai, Luhua
Pei, Jianfeng
author_sort Chu, Xiaoquan
collection PubMed
description BACKGROUND: The liquid–liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both, and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. Development of computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS. RESULTS: Based on the PSPs collected in LLPSDB, we developed a sequence-based prediction tool for LLPS proteins (PSPredictor), which is an attempt at general-purpose PSP prediction that does not depend on specific protein types. Our method combines componential and sequential information during the protein embedding stage and adopts a machine learning algorithm for the final prediction. The proposed method achieves a tenfold cross-validation accuracy of 94.71% and outperforms previously reported PSP prediction tools. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which is accessible for prediction of potential PSPs. CONCLUSIONS: PSPredictor could identify novel scaffold proteins for stress granules and predict PSP candidates in the human genome for further study. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which provides valuable information for potential PSP recognition. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04599-w.
format Online
Article
Text
id pubmed-8845408
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-88454082022-02-16 Prediction of liquid–liquid phase separating proteins using machine learning Chu, Xiaoquan Sun, Tanlin Li, Qian Xu, Youjun Zhang, Zhuqing Lai, Luhua Pei, Jianfeng BMC Bioinformatics Software BACKGROUND: The liquid–liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both, and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. Development of computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS. RESULTS: Based on the PSPs collected in LLPSDB, we developed a sequence-based prediction tool for LLPS proteins (PSPredictor), which is an attempt at general-purpose PSP prediction that does not depend on specific protein types. Our method combines componential and sequential information during the protein embedding stage and adopts a machine learning algorithm for the final prediction. The proposed method achieves a tenfold cross-validation accuracy of 94.71% and outperforms previously reported PSP prediction tools. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which is accessible for prediction of potential PSPs. CONCLUSIONS: PSPredictor could identify novel scaffold proteins for stress granules and predict PSP candidates in the human genome for further study. For further applications, we built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor), which provides valuable information for potential PSP recognition. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04599-w. BioMed Central 2022-02-15 /pmc/articles/PMC8845408/ /pubmed/35168563 http://dx.doi.org/10.1186/s12859-022-04599-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Chu, Xiaoquan
Sun, Tanlin
Li, Qian
Xu, Youjun
Zhang, Zhuqing
Lai, Luhua
Pei, Jianfeng
Prediction of liquid–liquid phase separating proteins using machine learning
title Prediction of liquid–liquid phase separating proteins using machine learning
title_full Prediction of liquid–liquid phase separating proteins using machine learning
title_fullStr Prediction of liquid–liquid phase separating proteins using machine learning
title_full_unstemmed Prediction of liquid–liquid phase separating proteins using machine learning
title_short Prediction of liquid–liquid phase separating proteins using machine learning
title_sort prediction of liquid–liquid phase separating proteins using machine learning
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8845408/
https://www.ncbi.nlm.nih.gov/pubmed/35168563
http://dx.doi.org/10.1186/s12859-022-04599-w
work_keys_str_mv AT chuxiaoquan predictionofliquidliquidphaseseparatingproteinsusingmachinelearning
AT suntanlin predictionofliquidliquidphaseseparatingproteinsusingmachinelearning
AT liqian predictionofliquidliquidphaseseparatingproteinsusingmachinelearning
AT xuyoujun predictionofliquidliquidphaseseparatingproteinsusingmachinelearning
AT zhangzhuqing predictionofliquidliquidphaseseparatingproteinsusingmachinelearning
AT lailuhua predictionofliquidliquidphaseseparatingproteinsusingmachinelearning
AT peijianfeng predictionofliquidliquidphaseseparatingproteinsusingmachinelearning