Comparison of the performance of machine learning algorithms in breast cancer screening and detection: A protocol
Background: Breast Cancer (BC) is a known global crisis. The World Health Organization reports a global 2.09 million incidences and 627,000 deaths in 2018 relating to BC. The traditional BC screening method in developed countries is mammography, whilst developing countries employ breast self-examina...
Main authors: | Salod, Zakia; Singh, Yashik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
PAGEPress Publications, Pavia, Italy
2019
|
Subjects: | Study Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902303/ https://www.ncbi.nlm.nih.gov/pubmed/31857990 http://dx.doi.org/10.4081/jphr.2019.1677 |
_version_ | 1783477638570967040 |
---|---|
author | Salod, Zakia Singh, Yashik |
collection | PubMed |
description | Background: Breast Cancer (BC) is a known global crisis. The World Health Organization reports a global 2.09 million incidences and 627,000 deaths in 2018 relating to BC. The traditional BC screening method in developed countries is mammography, whilst developing countries employ breast self-examination and clinical breast examination. The prominent gold standard for BC detection is triple assessment: i) clinical examination, ii) mammography and/or ultrasonography; and iii) Fine Needle Aspirate Cytology. However, the introduction of cheaper, efficient and noninvasive methods of BC screening and detection would be beneficial. Design and methods: We propose the use of eight machine learning algorithms: i) Logistic Regression; ii) Support Vector Machine; iii) K-Nearest Neighbors; iv) Decision Tree; v) Random Forest; vi) Adaptive Boosting; vii) Gradient Boosting; viii) eXtreme Gradient Boosting, and blood test results using BC Coimbra Dataset (BCCD) from University of California Irvine online database to create models for BC prediction. To ensure the models’ robustness, we will employ: i) Stratified k-fold Cross-Validation; ii) Correlation-based Feature Selection (CFS); and iii) parameter tuning. The models will be validated on validation and test sets of BCCD for full features and reduced features. Feature reduction has an impact on algorithm performance. Seven metrics will be used for model evaluation, including accuracy. Expected impact of the study for public health: The CFS together with highest performing model(s) can serve to identify important specific blood tests that point towards BC, which may serve as an important BC biomarker. Highest performing model(s) may eventually be used to create an Artificial Intelligence tool to assist clinicians in BC screening and detection. |
format | Online Article Text |
id | pubmed-6902303 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | PAGEPress Publications, Pavia, Italy |
record_format | MEDLINE/PubMed |
spelling | pubmed-6902303 2019-12-19 Comparison of the performance of machine learning algorithms in breast cancer screening and detection: A protocol Salod, Zakia Singh, Yashik J Public Health Res Study Protocol PAGEPress Publications, Pavia, Italy 2019-12-04 /pmc/articles/PMC6902303/ /pubmed/31857990 http://dx.doi.org/10.4081/jphr.2019.1677 Text en ©Copyright: the Author(s), 2019 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution Noncommercial License (by-nc 4.0) which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited. |
title | Comparison of the performance of machine learning algorithms in breast cancer screening and detection: A protocol |
topic | Study Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902303/ https://www.ncbi.nlm.nih.gov/pubmed/31857990 http://dx.doi.org/10.4081/jphr.2019.1677 |
work_keys_str_mv | AT salodzakia comparisonoftheperformanceofmachinelearningalgorithmsinbreastcancerscreeninganddetectionaprotocol AT singhyashik comparisonoftheperformanceofmachinelearningalgorithmsinbreastcancerscreeninganddetectionaprotocol |
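
Two of the robustness steps named in the description field, stratified k-fold cross-validation and correlation-based feature selection (CFS), can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the toy labels, and the use of Pearson correlation as the correlation measure are assumptions made for illustration, and the record does not state which correlation measure or value of k the protocol uses. The merit score follows the usual CFS form, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-to-class correlation and r_ff the mean feature-to-feature correlation.

```python
import math
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror `labels`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def cfs_merit(columns, target):
    """CFS merit of a feature subset (assumption: Pearson as the measure)."""
    k = len(columns)
    r_cf = sum(abs(pearson(col, target)) for col in columns) / k
    pairs = [(a, b) for i, a in enumerate(columns) for b in columns[i + 1:]]
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


if __name__ == "__main__":
    # Toy labels only (1 = patient, 0 = control); not taken from BCCD.
    toy_labels = [1] * 12 + [0] * 8
    for train_idx, test_idx in stratified_k_fold(toy_labels, k=4):
        positives = sum(toy_labels[i] for i in test_idx)
        print(len(train_idx), len(test_idx), positives)
```

In a full pipeline, each of the eight classifiers would be fitted on the training indices of every fold and scored on the held-out indices, once with all features and once with the subset that maximises the merit score.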