PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data
Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i...
Main Authors: | El-Manzalawy, Yasser; Munoz, Elyse E.; Lindner, Scott E.; Honavar, Vasant |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5600274/ https://www.ncbi.nlm.nih.gov/pubmed/27714937 http://dx.doi.org/10.1002/pmic.201600249 |
_version_ | 1783264214863839232 |
---|---|
author | El-Manzalawy, Yasser; Munoz, Elyse E.; Lindner, Scott E.; Honavar, Vasant |
author_facet | El-Manzalawy, Yasser; Munoz, Elyse E.; Lindner, Scott E.; Honavar, Vasant |
author_sort | El-Manzalawy, Yasser |
collection | PubMed |
description | Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies. |
format | Online Article Text |
id | pubmed-5600274 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
record_format | MEDLINE/PubMed |
spelling | pubmed-5600274 2017-09-15 PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data El-Manzalawy, Yasser; Munoz, Elyse E.; Lindner, Scott E.; Honavar, Vasant Proteomics Article 2016-11-21 2016-12 /pmc/articles/PMC5600274/ /pubmed/27714937 http://dx.doi.org/10.1002/pmic.201600249 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. |
title | PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5600274/ https://www.ncbi.nlm.nih.gov/pubmed/27714937 http://dx.doi.org/10.1002/pmic.201600249 |