
PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data

Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i...


Bibliographic Details
Main Authors: El-Manzalawy, Yasser, Munoz, Elyse E., Lindner, Scott E., Honavar, Vasant
Format: Online Article Text
Language: English
Published: 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5600274/
https://www.ncbi.nlm.nih.gov/pubmed/27714937
http://dx.doi.org/10.1002/pmic.201600249
author El-Manzalawy, Yasser
Munoz, Elyse E.
Lindner, Scott E.
Honavar, Vasant
collection PubMed
description Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies.
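The description above names semisupervised self-training but does not spell out the procedure. The following is a minimal sketch of generic self-training, not the PlasmoSEP algorithm itself: a nearest-centroid scorer stands in for whatever classifier the authors used, and the two-dimensional feature vectors, the `threshold` value, and all function names are assumptions made for illustration only.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def score_positive(x, pos_c, neg_c):
    """Score in (0, 1): higher means x lies closer to the positive centroid."""
    d_pos = math.dist(x, pos_c)
    d_neg = math.dist(x, neg_c)
    return 1.0 / (1.0 + math.exp(d_pos - d_neg))


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=10):
    """Generic self-training loop (hypothetical stand-in, not PlasmoSEP).

    labeled   -- list of (feature_tuple, label) pairs, label 1 or 0;
                 must contain at least one example of each class
    unlabeled -- list of feature tuples
    Returns the augmented labeled list and the samples left unlabeled.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        # Refit the model on everything labeled so far.
        pos_c = centroid([x for x, y in labeled if y == 1])
        neg_c = centroid([x for x, y in labeled if y == 0])
        confident, remaining = [], []
        for x in pool:
            p = score_positive(x, pos_c, neg_c)
            if p >= threshold:
                confident.append((x, 1))   # confident positive pseudo-label
            elif p <= 1.0 - threshold:
                confident.append((x, 0))   # confident negative pseudo-label
            else:
                remaining.append(x)        # too uncertain, keep unlabeled
        if not confident:
            break                          # nothing new to add, so stop
        labeled.extend(confident)
        pool = remaining
    return labeled, pool
```

In this sketch, samples that never reach the confidence threshold stay unlabeled rather than being forced into a class, which is one common way to limit the propagation of wrong pseudo-labels.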
format Online
Article
Text
id pubmed-5600274
institution National Center for Biotechnology Information
language English
publishDate 2016
record_format MEDLINE/PubMed
spelling pubmed-5600274 2017-09-15 Proteomics Article 2016-11-21 2016-12 http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
title PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data
topic Article