Identification of stress response proteins through fusion of machine learning models and statistical paradigms

Proteins are a vital component of cells and perform the physiological functions that keep the body operating smoothly. Identifying a protein's function requires a detailed understanding of its structure. Stress proteins are essential mediators of several responses to ce...

Bibliographic Details
Main authors: Alzahrani, Ebraheem, Alghamdi, Wajdi, Ullah, Malik Zaka, Khan, Yaser Daanial
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8571424/
https://www.ncbi.nlm.nih.gov/pubmed/34741132
http://dx.doi.org/10.1038/s41598-021-99083-5
_version_ 1784595018828218368
author Alzahrani, Ebraheem
Alghamdi, Wajdi
Ullah, Malik Zaka
Khan, Yaser Daanial
author_facet Alzahrani, Ebraheem
Alghamdi, Wajdi
Ullah, Malik Zaka
Khan, Yaser Daanial
author_sort Alzahrani, Ebraheem
collection PubMed
description Proteins are a vital component of cells and perform the physiological functions that keep the body operating smoothly. Identifying a protein's function requires a detailed understanding of its structure. Stress proteins are essential mediators of several responses to cellular stress and are categorized based on their structural characteristics. These proteins are conserved across many eukaryotic and prokaryotic lineages and perform varied, crucial functions inside a cell. The in vivo, ex vivo, and in vitro identification of stress proteins is a time-consuming and costly task. This study aims to identify stress protein sequences with the aid of mathematical modelling and machine learning methods to supplement the aforementioned wet-lab methods. The model developed using Random Forest achieved 91.1% accuracy, while models based on a neural network and a support vector machine achieved 87.7% and 47.0% accuracy, respectively. Based on the evaluation results, it was concluded that the random-forest-based classifier surpassed all other predictors and is suitable for practical use in identifying stress proteins. A live web server is available at http://biopred.org/stressprotiens, and the web server code is available at https://github.com/abdullah5naveed/SRP_WebServer.git
format Online
Article
Text
id pubmed-8571424
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8571424 2021-11-09 Identification of stress response proteins through fusion of machine learning models and statistical paradigms Alzahrani, Ebraheem Alghamdi, Wajdi Ullah, Malik Zaka Khan, Yaser Daanial Sci Rep Article Proteins are a vital component of cells and perform the physiological functions that keep the body operating smoothly. Identifying a protein's function requires a detailed understanding of its structure. Stress proteins are essential mediators of several responses to cellular stress and are categorized based on their structural characteristics. These proteins are conserved across many eukaryotic and prokaryotic lineages and perform varied, crucial functions inside a cell. The in vivo, ex vivo, and in vitro identification of stress proteins is a time-consuming and costly task. This study aims to identify stress protein sequences with the aid of mathematical modelling and machine learning methods to supplement the aforementioned wet-lab methods. The model developed using Random Forest achieved 91.1% accuracy, while models based on a neural network and a support vector machine achieved 87.7% and 47.0% accuracy, respectively. Based on the evaluation results, it was concluded that the random-forest-based classifier surpassed all other predictors and is suitable for practical use in identifying stress proteins.
A live web server is available at http://biopred.org/stressprotiens, and the web server code is available at https://github.com/abdullah5naveed/SRP_WebServer.git Nature Publishing Group UK 2021-11-05 /pmc/articles/PMC8571424/ /pubmed/34741132 http://dx.doi.org/10.1038/s41598-021-99083-5 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Alzahrani, Ebraheem
Alghamdi, Wajdi
Ullah, Malik Zaka
Khan, Yaser Daanial
Identification of stress response proteins through fusion of machine learning models and statistical paradigms
title Identification of stress response proteins through fusion of machine learning models and statistical paradigms
title_full Identification of stress response proteins through fusion of machine learning models and statistical paradigms
title_fullStr Identification of stress response proteins through fusion of machine learning models and statistical paradigms
title_full_unstemmed Identification of stress response proteins through fusion of machine learning models and statistical paradigms
title_short Identification of stress response proteins through fusion of machine learning models and statistical paradigms
title_sort identification of stress response proteins through fusion of machine learning models and statistical paradigms
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8571424/
https://www.ncbi.nlm.nih.gov/pubmed/34741132
http://dx.doi.org/10.1038/s41598-021-99083-5
work_keys_str_mv AT alzahraniebraheem identificationofstressresponseproteinsthroughfusionofmachinelearningmodelsandstatisticalparadigms
AT alghamdiwajdi identificationofstressresponseproteinsthroughfusionofmachinelearningmodelsandstatisticalparadigms
AT ullahmalikzaka identificationofstressresponseproteinsthroughfusionofmachinelearningmodelsandstatisticalparadigms
AT khanyaserdaanial identificationofstressresponseproteinsthroughfusionofmachinelearningmodelsandstatisticalparadigms