A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks
Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Me...
Main authors: | Mei, Suyu; Zhu, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5379509/ https://www.ncbi.nlm.nih.gov/pubmed/25620466 http://dx.doi.org/10.1038/srep08034 |
_version_ | 1782519620597448704 |
---|---|
author | Mei, Suyu Zhu, Hao |
author_facet | Mei, Suyu Zhu, Hao |
author_sort | Mei, Suyu |
collection | PubMed |
description | Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus. |
format | Online Article Text |
id | pubmed-5379509 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-53795092017-04-11 A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks Mei, Suyu Zhu, Hao Sci Rep Article Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus. Nature Publishing Group 2015-01-26 /pmc/articles/PMC5379509/ /pubmed/25620466 http://dx.doi.org/10.1038/srep08034 Text en Copyright © 2015, Macmillan Publishers Limited. All rights reserved http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
title | A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5379509/ https://www.ncbi.nlm.nih.gov/pubmed/25620466 http://dx.doi.org/10.1038/srep08034 |