Re-evaluation of publicly available gene-expression databases using machine-learning yields a maximum prognostic power in breast cancer
Gene expression signatures refer to patterns of gene activities and are used to classify different types of cancer, determine prognosis, and guide treatment decisions. Advancements in high-throughput technology and machine learning have led to improvements to predict a patient’s prognosis for differ...
Main authors: | Tschodu, Dimitrij; Lippoldt, Jürgen; Gottheil, Pablo; Wegscheider, Anne-Sophie; Käs, Josef A.; Niendorf, Axel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10556090/ https://www.ncbi.nlm.nih.gov/pubmed/37798300 http://dx.doi.org/10.1038/s41598-023-41090-9 |
_version_ | 1785116803520790528 |
---|---|
author | Tschodu, Dimitrij Lippoldt, Jürgen Gottheil, Pablo Wegscheider, Anne-Sophie Käs, Josef A. Niendorf, Axel |
author_sort | Tschodu, Dimitrij |
collection | PubMed |
description | Gene expression signatures refer to patterns of gene activities and are used to classify different types of cancer, determine prognosis, and guide treatment decisions. Advancements in high-throughput technology and machine learning have led to improvements to predict a patient’s prognosis for different cancer phenotypes. However, computational methods for analyzing signatures have not been used to evaluate their prognostic power. Contention remains on the utility of gene expression signatures for prognosis. The prevalent approaches include random signatures, expert knowledge, and machine learning to construct an improved signature. We unify these approaches to evaluate their prognostic power. Re-evaluation of publicly available gene-expression data from 8 databases with 9 machine-learning models revealed previously unreported results. Gene-expression signatures are confirmed to be useful in predicting a patient’s prognosis. Convergent evidence from [Formula: see text] 10,000 signatures implicates a maximum prognostic power. By calculating the concordance index, which measures how well patients with different prognoses can be discriminated, we show that a signature can correctly discriminate patients’ prognoses no more than 80% of the time. Additionally, we show that more than 50% of the potentially available information is still missing at this value. We surmise that an accurate prognosis must incorporate molecular, clinical, histological, and other complementary factors. |
format | Online Article Text |
id | pubmed-10556090 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10556090 2023-10-07 Sci Rep Article Nature Publishing Group UK 2023-10-05 /pmc/articles/PMC10556090/ /pubmed/37798300 http://dx.doi.org/10.1038/s41598-023-41090-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Re-evaluation of publicly available gene-expression databases using machine-learning yields a maximum prognostic power in breast cancer |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10556090/ https://www.ncbi.nlm.nih.gov/pubmed/37798300 http://dx.doi.org/10.1038/s41598-023-41090-9 |
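
The description in this record refers to the concordance index, which measures how well patients with different prognoses can be discriminated. As a rough illustration only, the following Python sketch computes Harrell's concordance index for right-censored survival data. It is not the authors' implementation; the function name `concordance_index`, its arguments, and the toy numbers are assumptions made for this example. Pairs with tied observed times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher means a worse expected prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified version
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk was assigned to the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored at time 12.
toy_times = [5, 10, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))  # 4 of 5 comparable pairs agree: 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the ceiling of about 80% stated in the description would correspond to a value near 0.8 on this scale.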