
Big genomics and clinical data analytics strategies for precision cancer prognosis

The field of personalized and precise medicine in the era of big data analytics is growing rapidly. Previously, we proposed our model of patient classification termed Prognostic Signature Vector Matching (PSVM) and identified a 37 variable signature comprising 36 let-7b associated prognostic signifi...


Bibliographic Details
Main authors: Ow, Ghim Siong; Kuznetsov, Vladimir A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5098145/
https://www.ncbi.nlm.nih.gov/pubmed/27819294
http://dx.doi.org/10.1038/srep36493
collection PubMed
description The field of personalized and precision medicine in the era of big data analytics is growing rapidly. Previously, we proposed a model of patient classification termed Prognostic Signature Vector Matching (PSVM) and identified a 37-variable signature, comprising 36 let-7b-associated, prognostically significant mRNAs and the age risk factor, that stratified large high-grade serous ovarian cancer patient cohorts into three survival-significant risk groups. Here, we investigated the predictive performance of PSVM via optimization of the prognostic variable weights, which represent the relative importance of one prognostic variable over the others. In addition, we compared several multivariate prognostic models based on PSVM with classical machine learning techniques such as K-nearest-neighbor, support vector machine, random forest, neural networks and logistic regression. Our results revealed that negative log-rank p-values provide more robust weight values than other quantities such as hazard ratios, fold change, or a combination of those factors. PSVM and the classical machine learning classifiers were combined in an ensemble (multi-test) voting system, which collectively provides a more precise and reproducible patient stratification. The use of the multi-test system approach, rather than the search for the ideal classification/prediction method, might help to address the limitations of individual classification algorithms in specific situations.
format Online
Article
Text
id pubmed-5098145
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5098145 2016-11-10 Big genomics and clinical data analytics strategies for precision cancer prognosis Ow, Ghim Siong Kuznetsov, Vladimir A. Sci Rep Article Nature Publishing Group 2016-11-07 /pmc/articles/PMC5098145/ /pubmed/27819294 http://dx.doi.org/10.1038/srep36493 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/