
A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks

Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Me...


Bibliographic Details
Main authors: Mei, Suyu, Zhu, Hao
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5379509/
https://www.ncbi.nlm.nih.gov/pubmed/25620466
http://dx.doi.org/10.1038/srep08034
author Mei, Suyu
Zhu, Hao
collection PubMed
description Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus.
format Online
Article
Text
id pubmed-5379509
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5379509 2017-04-11 A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks Mei, Suyu Zhu, Hao Sci Rep Article Nature Publishing Group 2015-01-26 /pmc/articles/PMC5379509/ /pubmed/25620466 http://dx.doi.org/10.1038/srep08034 Text en Copyright © 2015, Macmillan Publishers Limited. All rights reserved http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks
topic Article
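The two-stage idea in the description field (a one-class SVM to choose negative data, then a two-class SVM whose proteome-wide output feeds back into model selection) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the feature vectors are synthetic, the helper names `sample_negatives` and `train_two_class` are hypothetical, and two things are assumptions the abstract does not specify: picking the candidates with the lowest one-class decision scores as negatives, and using the fraction of candidates predicted positive as the feedback signal. It uses scikit-learn's `OneClassSVM` and `SVC`.

```python
import numpy as np
from sklearn.svm import OneClassSVM, SVC


def sample_negatives(X_pos, X_unlabeled, n_neg, nu=0.1, gamma="scale"):
    """Return indices of unlabeled pairs that look least like known positives.

    Assumption: "least like" means the lowest one-class SVM decision score.
    """
    oc = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_pos)
    scores = oc.decision_function(X_unlabeled)  # higher = closer to positives
    return np.argsort(scores)[:n_neg]           # ascending: lowest scores first


def train_two_class(X_pos, X_neg, C=1.0, gamma="scale"):
    """Train a two-class SVM on known positives and the sampled negatives."""
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(len(X_pos), dtype=int),
                        np.zeros(len(X_neg), dtype=int)])
    return SVC(kernel="rbf", C=C, gamma=gamma).fit(X, y)


# Synthetic stand-ins for protein-pair feature vectors (not real PPI data).
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.5, size=(60, 5))      # known interactions
X_unlabeled = np.vstack([
    rng.normal(loc=2.0, scale=0.5, size=(20, 5)),         # hidden positives
    rng.normal(loc=-2.0, scale=0.5, size=(180, 5)),       # likely non-interacting
])

neg_idx = sample_negatives(X_pos, X_unlabeled, n_neg=60)
clf = train_two_class(X_pos, X_unlabeled[neg_idx])

# Assumed feedback signal: share of all candidate pairs predicted to interact.
predicted_fraction = clf.predict(X_unlabeled).mean()
print(f"fraction of candidates predicted positive: {predicted_fraction:.3f}")
```

In a model-selection loop, one might repeat the last two steps over a grid of `C` and `gamma` values and discard settings whose predicted fraction is implausibly high, which is one way to read "predictive feedback constrained model selection".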