Integrating Somatic Mutations for Breast Cancer Survival Prediction Using Machine Learning Methods
Breast cancer is the most common malignancy in women, and because it has a high mortality rate, it is urgent to develop computational methods to increase the accuracy of breast cancer survival predictive models. Although multi-omics data such as gene expression have been extensively used in recent s...
Main authors: | He, Zongzhen; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7848170/ https://www.ncbi.nlm.nih.gov/pubmed/33537063 http://dx.doi.org/10.3389/fgene.2020.632901 |
_version_ | 1783645073540382720 |
---|---|
author | He, Zongzhen; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan |
author_facet | He, Zongzhen; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan |
author_sort | He, Zongzhen |
collection | PubMed |
description | Breast cancer is the most common malignancy in women, and because it has a high mortality rate, there is an urgent need for computational methods that increase the accuracy of breast cancer survival prediction models. Although multi-omics data such as gene expression have been used extensively in recent studies, the accurate prognosis of breast cancer remains a challenge. Somatic mutations are another important and promising data source for studying cancer development, and their effect on the prognosis of breast cancer remains to be explored further. Meanwhile, these omics datasets are high-dimensional and redundant. Therefore, we adopted multiple kernel learning (MKL) to efficiently integrate somatic mutations with existing molecular data, including gene expression, copy number variation (CNV), methylation, and protein expression data, for the prediction of breast cancer survival. Before integration, the maximum relevance minimum redundancy (mRMR) feature selection method was used to select, for each type of data, features that are highly relevant to survival and have low redundancy among themselves. The experimental results demonstrated that the proposed method achieved the best performance and that prediction performance improved markedly when somatic mutations were included, indicating that somatic mutations are critical for improving breast cancer survival predictions. Moreover, mRMR was superior to other feature selection methods used in previous studies. Furthermore, MKL outperformed the other traditional classifiers in multi-omics data integration. Our analysis indicated that by employing promising omics data such as somatic mutations and by using proper feature selection methods and effective integration frameworks, the accuracy of breast cancer survival prediction can be further increased, thereby supporting better clinical diagnosis and more effective treatment for breast cancer patients. |
format | Online Article Text |
id | pubmed-7848170 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7848170 2021-02-02 Integrating Somatic Mutations for Breast Cancer Survival Prediction Using Machine Learning Methods He, Zongzhen; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan Front Genet Genetics Frontiers Media S.A. 2021-01-18 /pmc/articles/PMC7848170/ /pubmed/33537063 http://dx.doi.org/10.3389/fgene.2020.632901 Text en Copyright © 2021 He, Zhang, Yuan and Zhang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Integrating Somatic Mutations for Breast Cancer Survival Prediction Using Machine Learning Methods |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7848170/ https://www.ncbi.nlm.nih.gov/pubmed/33537063 http://dx.doi.org/10.3389/fgene.2020.632901 |
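
The abstract in this record describes a two-stage pipeline: mRMR feature selection applied to each data type, followed by kernel-based integration of the selected features. The sketch below is only an illustration of that idea under stated assumptions, not the authors' implementation. It assumes absolute Pearson correlation as a stand-in for the relevance and redundancy measures (mRMR is usually defined with mutual information), uses fixed equal kernel weights instead of weights learned by an MKL solver, and runs on invented toy data. All function names, the `gamma` value, and the binary survival labels are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def mrmr_select(columns, labels, k):
    """Greedy mRMR-style selection of k >= 1 feature indices.

    Assumption: |Pearson r| replaces mutual information for both the
    relevance (feature vs. label) and redundancy (feature vs. feature) terms.
    """
    relevance = [abs(pearson(col, labels)) for col in columns]
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    while len(selected) < min(k, len(columns)):
        def score(j):
            redundancy = sum(abs(pearson(columns[j], columns[s])) for s in selected)
            return relevance[j] - redundancy / len(selected)
        rest = [j for j in range(len(columns)) if j not in selected]
        selected.append(max(rest, key=score))
    return selected


def rbf_kernel(rows, gamma):
    """Gram matrix of the Gaussian (RBF) kernel over a list of sample rows."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(r1, r2)))
             for r2 in rows] for r1 in rows]


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized Gram matrices (weights are fixed here, not learned)."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]


# Invented toy data: 8 patients with hypothetical binary survival labels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expression = [                      # one list per feature (column-major)
    [0, 0, 0, 1, 1, 1, 1, 1],       # informative
    [0, 0, 0, 1, 1, 1, 1, 1],       # exact duplicate of feature 0 (redundant)
    [0, 0, 0, 0, 0, 1, 1, 1],       # informative, only partly correlated with feature 0
    [0, 1, 0, 1, 0, 1, 0, 1],       # uncorrelated with the labels
]
mutation = [
    [0, 0, 1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 1, 1],
]

selected = mrmr_select(expression, labels, k=2)

# Transpose the chosen columns into per-patient rows, then build one kernel per data type.
expr_rows = [list(r) for r in zip(*(expression[j] for j in selected))]
mut_rows = [list(r) for r in zip(*mutation)]
kernels = [rbf_kernel(expr_rows, gamma=0.5), rbf_kernel(mut_rows, gamma=0.5)]
combined = combine_kernels(kernels, weights=[0.5, 0.5])

print("selected expression features:", selected)
print("combined kernel, first row:", [round(v, 3) for v in combined[0]])
```

In this toy run the duplicated expression feature is skipped in favor of the second informative one, which is the behavior the redundancy term is meant to produce. The combined Gram matrix could then be passed to any classifier that accepts a precomputed kernel; a full MKL method would additionally learn the kernel weights from the training data.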