Prediction of endoplasmic reticulum resident proteins using fragmented amino acid composition and support vector machine
BACKGROUND: The endoplasmic reticulum plays an important role in many cellular processes, including protein synthesis, folding and post-translational processing of newly synthesized proteins. It is also the site for quality control of misfolded proteins and the entry point of extracellular proteins...
Main authors: | Kumar, Ravindra; Kumari, Bandana; Kumar, Manish |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2017 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588793/ https://www.ncbi.nlm.nih.gov/pubmed/28890846 http://dx.doi.org/10.7717/peerj.3561 |
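
The "fragmented amino acid composition" named in the title can be illustrated with a minimal sketch. This record does not say how the authors define the fragments, so the splitting rule here (near-equal contiguous fragments, controlled by a hypothetical `n_parts` parameter) is an assumption, as are the function names and the example sequence. Each fragment contributes 20 residue frequencies, and the per-fragment vectors are concatenated into one feature vector of the kind that could be passed to a support vector machine.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fractional frequency of each standard residue in seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        # Empty fragment or no standard residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def fragmented_composition(seq, n_parts=3):
    """Concatenate the compositions of n_parts contiguous fragments.

    Assumption: fragments are near-equal in length. The published method
    may define its fragments differently.
    """
    n = len(seq)
    bounds = [round(i * n / n_parts) for i in range(n_parts + 1)]
    features = []
    for start, end in zip(bounds, bounds[1:]):
        features.extend(composition(seq[start:end]))
    return features


# Hypothetical sequence, used only to show the shape of the output.
example = "MKWVTFISLLLLFSSAYSRGVFRRDTHKDEL"
print(len(fragmented_composition(example)))  # 60 features: 3 fragments x 20
```

Compared with a single whole-sequence composition, splitting the sequence keeps some positional information, for example whether a residue is enriched near the C-terminus.
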
_version_ | 1783262238924079104 |
---|---|
author | Kumar, Ravindra Kumari, Bandana Kumar, Manish |
author_sort | Kumar, Ravindra |
collection | PubMed |
description | BACKGROUND: The endoplasmic reticulum plays an important role in many cellular processes, including protein synthesis, folding and post-translational processing of newly synthesized proteins. It is also the site for quality control of misfolded proteins and the entry point of extracellular proteins to the secretory pathway. Hence, at any given time, the endoplasmic reticulum contains two different cohorts of proteins: (i) proteins involved in endoplasmic reticulum-specific functions, which reside in the lumen of the endoplasmic reticulum and are called endoplasmic reticulum resident proteins, and (ii) proteins that are in the process of moving to the extracellular space. Thus, endoplasmic reticulum resident proteins must somehow be distinguished from newly synthesized secretory proteins, which pass through the endoplasmic reticulum on their way out of the cell. Only about 50% of the proteins used as training data in this study had an endoplasmic reticulum retention signal, which shows that these signals are not necessarily present in all endoplasmic reticulum resident proteins. This also strongly indicates a role for additional factors in the retention of endoplasmic reticulum-specific proteins inside the endoplasmic reticulum. METHODS: This is a support vector machine based method in which different forms of protein features were used as inputs to the support vector machine to develop the prediction models. During training, the leave-one-out approach to cross-validation was used. Maximum performance was obtained with a combination of the amino acid compositions of different parts of the proteins. RESULTS: In this study, we report a novel support vector machine based method for predicting endoplasmic reticulum resident proteins, named ERPred. During training, we achieved a maximum accuracy of 81.42% with leave-one-out cross-validation. When evaluated on an independent dataset, ERPred achieved a sensitivity of 72.31% and a specificity of 83.69%. We have also annotated six different proteomes to predict the candidate endoplasmic reticulum resident proteins in them. A webserver, ERPred, was developed to make the method available to the scientific community; it can be accessed at http://proteininformatics.org/mkumar/erpred/index.html. DISCUSSION: We found that of the 124 proteins in the training dataset, only 66 had endoplasmic reticulum retention signals, which shows that these signals are not an absolute necessity for endoplasmic reticulum resident proteins to remain inside the endoplasmic reticulum. This observation also strongly indicates a role for additional factors in the retention of proteins inside the endoplasmic reticulum. Our proposed predictor, ERPred, is a signal-independent tool. It is tuned for the prediction of endoplasmic reticulum resident proteins even if the query protein does not contain a specific ER-retention signal. |
format | Online Article Text |
id | pubmed-5588793 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5588793 2017-09-08 Prediction of endoplasmic reticulum resident proteins using fragmented amino acid composition and support vector machine Kumar, Ravindra Kumari, Bandana Kumar, Manish PeerJ Bioinformatics PeerJ Inc. 2017-09-04 /pmc/articles/PMC5588793/ /pubmed/28890846 http://dx.doi.org/10.7717/peerj.3561 Text en ©2017 Kumar et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | Prediction of endoplasmic reticulum resident proteins using fragmented amino acid composition and support vector machine |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588793/ https://www.ncbi.nlm.nih.gov/pubmed/28890846 http://dx.doi.org/10.7717/peerj.3561 |
work_keys_str_mv | AT kumarravindra predictionofendoplasmicreticulumresidentproteinsusingfragmentedaminoacidcompositionandsupportvectormachine AT kumaribandana predictionofendoplasmicreticulumresidentproteinsusingfragmentedaminoacidcompositionandsupportvectormachine AT kumarmanish predictionofendoplasmicreticulumresidentproteinsusingfragmentedaminoacidcompositionandsupportvectormachine |
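
The evaluation terms used in the description (leave-one-out cross-validation, sensitivity, specificity, accuracy) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the record does not give the confusion-matrix counts behind the reported 72.31% and 83.69%, so none are reproduced here. The only figures taken from the record are the 66 of 124 training proteins that carry a retention signal.

```python
def leave_one_out_splits(n_samples):
    """Yield (train_indices, held_out_index), holding out one sample per round."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out


def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for labels in {0, 1}.

    1 marks an ER resident protein and 0 marks any other protein.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


# Share of training proteins with a retention signal (figures from the record).
print(f"{66 / 124:.1%}")  # 53.2%
```

In a leave-one-out run, each of the 124 training proteins would be held out once, a model would be fitted on the remaining 123, and the held-out predictions would be pooled before calling `binary_metrics`.
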