
Machine Learning analysis of high-grade serous ovarian cancer proteomic dataset reveals novel candidate biomarkers

Ovarian cancer is one of the most common gynecological malignancies, ranking third after cervical and uterine cancer. High-grade serous ovarian cancer (HGSOC) is one of the most aggressive subtypes, and the late onset of its symptoms leads in most cases to an unfavourable prognosis. Current predictiv...

Bibliographic Details
Main authors: Farinella, Federica; Merone, Mario; Bacco, Luca; Capirchio, Adriano; Ciccozzi, Massimo; Caligiore, Daniele
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8866540/
https://www.ncbi.nlm.nih.gov/pubmed/35197484
http://dx.doi.org/10.1038/s41598-022-06788-2
_version_ 1784655860212957184
author Farinella, Federica
Merone, Mario
Bacco, Luca
Capirchio, Adriano
Ciccozzi, Massimo
Caligiore, Daniele
collection PubMed
description Ovarian cancer is one of the most common gynecological malignancies, ranking third after cervical and uterine cancer. High-grade serous ovarian cancer (HGSOC) is one of the most aggressive subtypes, and the late onset of its symptoms leads in most cases to an unfavourable prognosis. Current predictive algorithms used to estimate the risk of having ovarian cancer fail to provide sufficient sensitivity and specificity to be used widely in clinical practice. The use of additional biomarkers or parameters such as age or menopausal status to overcome these issues has yielded only weak improvements. It is necessary to identify novel molecular signatures and to develop new predictive algorithms that can support the diagnosis of HGSOC and, at the same time, deepen the understanding of this elusive disease, with the final goal of improving patient survival. Here, we apply a machine learning-based pipeline to an open-source HGSOC proteomic dataset to develop a decision support system (DSS) that displayed high discriminating ability on a dataset of HGSOC biopsies. The proposed DSS consists of a two-step feature selection followed by a decision tree; its output is a combination of three highly discriminating proteins, TOP1, PDIA4, and OGN, which could be of interest for further clinical and experimental validation. Furthermore, we took advantage of the ranked list of proteins generated during the feature selection steps to perform a pathway analysis that provides a snapshot of the main deregulated pathways of HGSOC. The datasets used for this study are available in the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data portal (https://cptac-data-portal.georgetown.edu/).
format Online
Article
Text
id pubmed-8866540
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8866540 2022-02-25 Machine Learning analysis of high-grade serous ovarian cancer proteomic dataset reveals novel candidate biomarkers Farinella, Federica Merone, Mario Bacco, Luca Capirchio, Adriano Ciccozzi, Massimo Caligiore, Daniele Sci Rep Article
Nature Publishing Group UK 2022-02-23 /pmc/articles/PMC8866540/ /pubmed/35197484 http://dx.doi.org/10.1038/s41598-022-06788-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Machine Learning analysis of high-grade serous ovarian cancer proteomic dataset reveals novel candidate biomarkers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8866540/
https://www.ncbi.nlm.nih.gov/pubmed/35197484
http://dx.doi.org/10.1038/s41598-022-06788-2