
Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning

INTRODUCTION: Accurate prediction of patient prognosis can be especially useful for the selection of best treatment protocols. Machine Learning can serve this purpose by making predictions based upon generalizable clinical patterns embedded within learning datasets. We designed a study to support th...


Bibliographic Details
Main authors: Laios, Alexandros, Katsenou, Angeliki, Tan, Yong Sheng, Johnson, Racheal, Otify, Mohamed, Kaufmann, Angelika, Munot, Sarika, Thangavelu, Amudha, Hutson, Richard, Broadhead, Tim, Theophilou, Georgios, Nugent, David, De Jong, Diederick
Format: Online Article Text
Language: English
Published: SAGE Publications 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8549478/
https://www.ncbi.nlm.nih.gov/pubmed/34693730
http://dx.doi.org/10.1177/10732748211044678
_version_ 1784590793664626688
author Laios, Alexandros
Katsenou, Angeliki
Tan, Yong Sheng
Johnson, Racheal
Otify, Mohamed
Kaufmann, Angelika
Munot, Sarika
Thangavelu, Amudha
Hutson, Richard
Broadhead, Tim
Theophilou, Georgios
Nugent, David
De Jong, Diederick
collection PubMed
description INTRODUCTION: Accurate prediction of patient prognosis can be especially useful for the selection of best treatment protocols. Machine Learning can serve this purpose by making predictions based upon generalizable clinical patterns embedded within learning datasets. We designed a study to support the feature selection for the 2-year prognostic period and compared the performance of several Machine Learning prediction algorithms for accurate 2-year prognosis estimation in advanced-stage high grade serous ovarian cancer (HGSOC) patients. METHODS: The prognosis estimation was formulated as a binary classification problem. The dataset was split into training and test cohorts with repeated random sampling until there was no significant difference (p = 0.20) between the two cohorts. A ten-fold cross-validation was applied. Various state-of-the-art supervised classifiers were used. For feature selection, in addition to the exhaustive search for the best combination of features, we used the chi-square test of independence and the MRMR method. RESULTS: Two hundred nine patients were identified. The model's mean prediction accuracy reached 73%. We demonstrated that Support-Vector-Machine and Ensemble Subspace Discriminant algorithms outperformed Logistic Regression in accuracy indices. The probability of achieving a cancer-free state was maximised with a combination of primary cytoreduction, good performance status and maximal surgical effort (AUC 0.63). Standard chemotherapy, performance status, tumour load and residual disease were consistently predictive of the mid-term overall survival (AUC 0.63–0.66). The model's recall and precision were greater than 80%. CONCLUSION: Machine Learning appears to be promising for accurate prognosis estimation. Appropriate feature selection is required when building an HGSOC model for 2-year prognosis prediction. We provide evidence as to what combination of prognosticators leads to the largest impact on the HGSOC 2-year prognosis.
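The description above names the chi-square test of independence as one of the feature selection methods. The following is a minimal sketch of how categorical features could be ranked against a binary outcome with the Pearson chi-square statistic. It is not the authors' implementation: the function names (`chi2_statistic`, `rank_features`), the feature names and the toy values are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def chi2_statistic(feature, outcome):
    """Pearson chi-square statistic for two categorical sequences of equal length."""
    if len(feature) != len(outcome):
        raise ValueError("feature and outcome must have the same length")
    n = len(feature)
    observed = Counter(zip(feature, outcome))  # contingency table cell counts
    row_totals = Counter(feature)              # marginal counts per feature level
    col_totals = Counter(outcome)              # marginal counts per outcome class
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n   # expected count under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def rank_features(features, outcome):
    """Return (name, statistic) pairs sorted from most to least associated."""
    scores = {name: chi2_statistic(values, outcome) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic toy data (hypothetical, not taken from the study).
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "feature_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly associated with the outcome
    "feature_b": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the outcome
}
ranking = rank_features(features, outcome)
print(ranking)
```

A larger statistic indicates a stronger departure from independence between a feature and the outcome. In practice the statistic would be converted to a p-value using the appropriate degrees of freedom, and small expected cell counts would call for caution.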
format Online
Article
Text
id pubmed-8549478
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-8549478 2021-10-28 Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning Laios, Alexandros Katsenou, Angeliki Tan, Yong Sheng Johnson, Racheal Otify, Mohamed Kaufmann, Angelika Munot, Sarika Thangavelu, Amudha Hutson, Richard Broadhead, Tim Theophilou, Georgios Nugent, David De Jong, Diederick Cancer Control Original Research Article INTRODUCTION: Accurate prediction of patient prognosis can be especially useful for the selection of best treatment protocols. Machine Learning can serve this purpose by making predictions based upon generalizable clinical patterns embedded within learning datasets. We designed a study to support the feature selection for the 2-year prognostic period and compared the performance of several Machine Learning prediction algorithms for accurate 2-year prognosis estimation in advanced-stage high grade serous ovarian cancer (HGSOC) patients. METHODS: The prognosis estimation was formulated as a binary classification problem. The dataset was split into training and test cohorts with repeated random sampling until there was no significant difference (p = 0.20) between the two cohorts. A ten-fold cross-validation was applied. Various state-of-the-art supervised classifiers were used. For feature selection, in addition to the exhaustive search for the best combination of features, we used the chi-square test of independence and the MRMR method. RESULTS: Two hundred nine patients were identified. The model's mean prediction accuracy reached 73%. We demonstrated that Support-Vector-Machine and Ensemble Subspace Discriminant algorithms outperformed Logistic Regression in accuracy indices. The probability of achieving a cancer-free state was maximised with a combination of primary cytoreduction, good performance status and maximal surgical effort (AUC 0.63). Standard chemotherapy, performance status, tumour load and residual disease were consistently predictive of the mid-term overall survival (AUC 0.63–0.66). The model's recall and precision were greater than 80%. CONCLUSION: Machine Learning appears to be promising for accurate prognosis estimation. Appropriate feature selection is required when building an HGSOC model for 2-year prognosis prediction. We provide evidence as to what combination of prognosticators leads to the largest impact on the HGSOC 2-year prognosis. SAGE Publications 2021-10-24 /pmc/articles/PMC8549478/ /pubmed/34693730 http://dx.doi.org/10.1177/10732748211044678 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning
topic Original Research Article