Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning
Breast cancer is one of the most common cancers in women worldwide. Owing to improvements in medical treatment, most breast cancer patients achieve remission. However, these patients then face a further challenge: recurrence of breast cancer, which may cause more severe effe...
Main authors: | Yang, Pei-Tse; Wu, Wen-Shuo; Wu, Chia-Chun; Shih, Yi-Nuo; Hsieh, Chung-Ho; Hsu, Jia-Lien |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | De Gruyter 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8122465/ https://www.ncbi.nlm.nih.gov/pubmed/34027105 http://dx.doi.org/10.1515/med-2021-0282 |
_version_ | 1783692625450106880 |
---|---|
author | Yang, Pei-Tse; Wu, Wen-Shuo; Wu, Chia-Chun; Shih, Yi-Nuo; Hsieh, Chung-Ho; Hsu, Jia-Lien |
author_facet | Yang, Pei-Tse; Wu, Wen-Shuo; Wu, Chia-Chun; Shih, Yi-Nuo; Hsieh, Chung-Ho; Hsu, Jia-Lien |
author_sort | Yang, Pei-Tse |
collection | PubMed |
description | Breast cancer is one of the most common cancers in women worldwide. Owing to improvements in medical treatment, most breast cancer patients achieve remission. However, these patients then face a further challenge: recurrence of breast cancer, which may cause more severe effects and even death. Predicting breast cancer recurrence is therefore crucial for reducing mortality. This paper proposes a prediction model for breast cancer recurrence based on clinical nominal and numeric features. In this study, our data consist of records of 1,061 patients from the Breast Cancer Registry of Shin Kong Wu Ho-Su Memorial Hospital between 2011 and 2016, of which 37 records are labeled as breast cancer recurrence. Each record has 85 features. Our approach consists of three stages. First, we apply data preprocessing and feature selection to consolidate the dataset; six of the 85 features are retained for the following stages. Next, we apply resampling techniques to address class imbalance. Finally, we construct two classifiers, AdaBoost and cost-sensitive learning, to predict the risk of recurrence, and evaluate their performance with three-fold cross-validation. With the AdaBoost method, we achieve an accuracy of 0.973 and a sensitivity of 0.675. By combining AdaBoost with the cost-sensitive method, we achieve an accuracy of 0.468 and a substantially higher sensitivity of 0.947, so that almost no recurrence is missed (almost no false dismissals). Our model can be used as a supporting tool when scheduling and evaluating follow-up visits, enabling early intervention and more advanced treatment to lower cancer mortality. |
format | Online Article Text |
id | pubmed-8122465 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | De Gruyter |
record_format | MEDLINE/PubMed |
spelling | pubmed-8122465 2021-05-21 Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning Yang, Pei-Tse Wu, Wen-Shuo Wu, Chia-Chun Shih, Yi-Nuo Hsieh, Chung-Ho Hsu, Jia-Lien Open Med (Wars) Research Article De Gruyter 2021-05-13 /pmc/articles/PMC8122465/ /pubmed/34027105 http://dx.doi.org/10.1515/med-2021-0282 Text en © 2021 Pei-Tse Yang et al., published by De Gruyter https://creativecommons.org/licenses/by/4.0/ This work is licensed under the Creative Commons Attribution 4.0 International License. |
spellingShingle | Research Article Yang, Pei-Tse Wu, Wen-Shuo Wu, Chia-Chun Shih, Yi-Nuo Hsieh, Chung-Ho Hsu, Jia-Lien Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning |
title | Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8122465/ https://www.ncbi.nlm.nih.gov/pubmed/34027105 http://dx.doi.org/10.1515/med-2021-0282 |
work_keys_str_mv | AT yangpeitse breastcancerrecurrencepredictionwithensemblemethodsandcostsensitivelearning AT wuwenshuo breastcancerrecurrencepredictionwithensemblemethodsandcostsensitivelearning AT wuchiachun breastcancerrecurrencepredictionwithensemblemethodsandcostsensitivelearning AT shihyinuo breastcancerrecurrencepredictionwithensemblemethodsandcostsensitivelearning AT hsiehchungho breastcancerrecurrencepredictionwithensemblemethodsandcostsensitivelearning AT hsujialien breastcancerrecurrencepredictionwithensemblemethodsandcostsensitivelearning |
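
The description field reports accuracy and sensitivity on a heavily imbalanced dataset (37 recurrences among 1,061 patients) and contrasts plain AdaBoost with a cost-sensitive variant. The sketch below is illustrative arithmetic only, not the authors' code: it shows why accuracy alone is misleading at this class ratio, and how a cost-sensitive rule lowers the decision threshold. The cost values and the example probabilities are hypothetical, and the threshold formula is the generic expected-cost rule, which is an assumption here and may differ from the method used in the paper.

```python
# Illustrative sketch; the costs and probabilities below are hypothetical.

def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Fraction of true recurrences that are detected (recall)."""
    positives = tp + fn
    return tp / positives if positives else 0.0


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability above which predicting 'recurrence' has lower expected cost.

    Predicting positive costs (1 - p) * cost_fp on average; predicting
    negative costs p * cost_fn. Positive is cheaper when
    p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict(probabilities, threshold):
    """Label a case 1 (recurrence) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Figures taken from the record.
N_PATIENTS = 1061
N_RECURRENCE = 37

# Trivial baseline: predict "no recurrence" for every patient.
baseline_acc = accuracy(tp=0, fp=0, tn=N_PATIENTS - N_RECURRENCE, fn=N_RECURRENCE)
baseline_sens = sensitivity(tp=0, fn=N_RECURRENCE)

# Hypothetical costs: a missed recurrence is taken to be 20x worse
# than a false alarm.
threshold = cost_sensitive_threshold(cost_fp=1.0, cost_fn=20.0)

# Hypothetical predicted recurrence probabilities for three patients.
example_probs = [0.03, 0.05, 0.60]
default_labels = predict(example_probs, 0.5)
cost_labels = predict(example_probs, threshold)

if __name__ == "__main__":
    print(f"baseline accuracy:    {baseline_acc:.3f}")
    print(f"baseline sensitivity: {baseline_sens:.3f}")
    print(f"cost-sensitive threshold: {threshold:.4f}")
    print("labels at 0.5:", default_labels)
    print("labels at cost-sensitive threshold:", cost_labels)
```

The always-negative baseline already reaches an accuracy of about 0.965 with zero sensitivity, which is why sensitivity is the more informative figure for this task. Lowering the threshold flags more patients, trading accuracy for fewer missed recurrences, the same direction of trade-off the description reports.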