
High-Throughput Identification of Mammalian Secreted Proteins Using Species-Specific Scheme and Application to Human Proteome

Secreted proteins are widespread in living organisms and cells. Because secreted proteins are easily detected in body fluids, urine, and saliva in clinical diagnosis, they play important roles as biomarkers for disease diagnosis and in vaccine production. In this study, we propose a novel predicto...


Bibliographic Details
Main Authors: Zhang, Jian; Chai, Haiting; Guo, Song; Guo, Huaping; Li, Yanling
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6099666/
https://www.ncbi.nlm.nih.gov/pubmed/29903999
http://dx.doi.org/10.3390/molecules23061448
author Zhang, Jian
Chai, Haiting
Guo, Song
Guo, Huaping
Li, Yanling
collection PubMed
description Secreted proteins are widespread in living organisms and cells. Because secreted proteins are easily detected in body fluids, urine, and saliva in clinical diagnosis, they play important roles as biomarkers for disease diagnosis and in vaccine production. In this study, we propose a novel predictor for accurate, high-throughput identification of mammalian secreted proteins based on sequence-derived features. We combine amino acid composition, sequence motifs, and physicochemical properties to encode the collected proteins. Detailed feature analyses demonstrate the effectiveness of the considered features. Based on the differences among secreted proteins of various species, we introduce a species-specific scheme, which is expected to further explore the intrinsic attributes of specific secreted proteins. Experiments on benchmark datasets demonstrate the effectiveness of the proposed method, and a test on an independent dataset indicates good generalization capability. Compared with the traditional universal model, we experimentally demonstrate that the species-specific scheme significantly improves prediction performance. We apply our method to the unreviewed human proteome and find 272 potential secreted proteins with predicted probabilities higher than 99%. A user-friendly web server named iMSPs (identification of Mammalian Secreted Proteins), which implements the proposed method, is freely available for academic use at: http://www.inforstation.com/webservers/iMSP/.
format Online
Article
Text
id pubmed-6099666
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6099666 2018-11-13 High-Throughput Identification of Mammalian Secreted Proteins Using Species-Specific Scheme and Application to Human Proteome Zhang, Jian Chai, Haiting Guo, Song Guo, Huaping Li, Yanling Molecules Article MDPI 2018-06-14 /pmc/articles/PMC6099666/ /pubmed/29903999 http://dx.doi.org/10.3390/molecules23061448 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title High-Throughput Identification of Mammalian Secreted Proteins Using Species-Specific Scheme and Application to Human Proteome
topic Article