
Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer

BACKGROUND: Prognostic stratification of breast cancers remains a challenge to improve clinical decision making. We employ machine learning on breast cancer transcriptomics from multiple studies to link the expression of specific genes to histological grade and classify tumours into a more or less a...


Bibliographic Details
Main authors: Amiri Souri, E., Chenoweth, A., Cheung, A., Karagiannis, S. N., Tsoka, S.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8405688/
https://www.ncbi.nlm.nih.gov/pubmed/34131308
http://dx.doi.org/10.1038/s41416-021-01455-1
_version_ 1783746379263246336
author Amiri Souri, E.
Chenoweth, A.
Cheung, A.
Karagiannis, S. N.
Tsoka, S.
author_facet Amiri Souri, E.
Chenoweth, A.
Cheung, A.
Karagiannis, S. N.
Tsoka, S.
author_sort Amiri Souri, E.
collection PubMed
description BACKGROUND: Prognostic stratification of breast cancers remains a challenge to improve clinical decision making. We employ machine learning on breast cancer transcriptomics from multiple studies to link the expression of specific genes to histological grade and classify tumours into a more or less aggressive prognostic type. MATERIALS AND METHODS: Microarray data of 5031 untreated breast tumours spanning 33 published datasets and corresponding clinical data were integrated. A machine learning model based on gradient boosted trees was trained on histological grade-1 and grade-3 samples. The resulting predictive model (Cancer Grade Model, CGM) was applied on samples of grade-2 and unknown-grade (3029) for prognostic risk classification. RESULTS: A 70-gene signature for assessing clinical risk was identified and was shown to be 90% accurate when tested on known histological-grade samples. The predictive framework was validated through survival analysis and showed robust prognostic performance. CGM was cross-referenced with existing genomic tests and demonstrated the competitive predictive power of tumour risk. CONCLUSIONS: CGM is able to classify tumours into better-defined prognostic categories without employing information on tumour size, stage, or subgroups. The model offers means to improve prognosis and support the clinical decision and precision treatments, thereby potentially contributing to preventing underdiagnosis of high-risk tumours and minimising over-treatment of low-risk disease.
format Online
Article
Text
id pubmed-8405688
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-84056882021-09-16 Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer Amiri Souri, E. Chenoweth, A. Cheung, A. Karagiannis, S. N. Tsoka, S. Br J Cancer Article BACKGROUND: Prognostic stratification of breast cancers remains a challenge to improve clinical decision making. We employ machine learning on breast cancer transcriptomics from multiple studies to link the expression of specific genes to histological grade and classify tumours into a more or less aggressive prognostic type. MATERIALS AND METHODS: Microarray data of 5031 untreated breast tumours spanning 33 published datasets and corresponding clinical data were integrated. A machine learning model based on gradient boosted trees was trained on histological grade-1 and grade-3 samples. The resulting predictive model (Cancer Grade Model, CGM) was applied on samples of grade-2 and unknown-grade (3029) for prognostic risk classification. RESULTS: A 70-gene signature for assessing clinical risk was identified and was shown to be 90% accurate when tested on known histological-grade samples. The predictive framework was validated through survival analysis and showed robust prognostic performance. CGM was cross-referenced with existing genomic tests and demonstrated the competitive predictive power of tumour risk. CONCLUSIONS: CGM is able to classify tumours into better-defined prognostic categories without employing information on tumour size, stage, or subgroups. The model offers means to improve prognosis and support the clinical decision and precision treatments, thereby potentially contributing to preventing underdiagnosis of high-risk tumours and minimising over-treatment of low-risk disease. 
Nature Publishing Group UK 2021-06-15 2021-08-31 /pmc/articles/PMC8405688/ /pubmed/34131308 http://dx.doi.org/10.1038/s41416-021-01455-1 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Amiri Souri, E.
Chenoweth, A.
Cheung, A.
Karagiannis, S. N.
Tsoka, S.
Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
title Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
title_full Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
title_fullStr Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
title_full_unstemmed Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
title_short Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
title_sort cancer grade model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8405688/
https://www.ncbi.nlm.nih.gov/pubmed/34131308
http://dx.doi.org/10.1038/s41416-021-01455-1
work_keys_str_mv AT amirisourie cancergrademodelamultigenemachinelearningbasedriskclassificationforimprovingprognosisinbreastcancer
AT chenowetha cancergrademodelamultigenemachinelearningbasedriskclassificationforimprovingprognosisinbreastcancer
AT cheunga cancergrademodelamultigenemachinelearningbasedriskclassificationforimprovingprognosisinbreastcancer
AT karagiannissn cancergrademodelamultigenemachinelearningbasedriskclassificationforimprovingprognosisinbreastcancer
AT tsokas cancergrademodelamultigenemachinelearningbasedriskclassificationforimprovingprognosisinbreastcancer