Multi-objective hyperparameter optimization on gradient-boosting for breast cancer detection
Main authors: | , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer India 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255948/ http://dx.doi.org/10.1007/s13198-023-01955-8 |
Summary: | Breast cancer, the most commonly occurring cancer among women, causes lakhs of deaths annually, which can be prevented by early detection and treatment. Detection can be done by using machine learning models on histopathological images, an approach that is affordable, reliable, and accurate. Previous studies have focused on transfer learning methods that combine feature selection using Convolutional Neural Networks (CNNs) with an ensemble of gradient-boosting algorithms. However, none of the state-of-the-art techniques captures the multi-objective nature of Breast Cancer Detection (BCD); they tend to improve a single performance measure such as Accuracy or F1 score, which fails to capture essential aspects of the problem because the cost of misclassification varies greatly depending on its type. In this study, a multi-objective hyperparameter optimization technique for breast cancer prediction is proposed, and random search, the Non-Dominated Sorting Genetic Algorithm (NSGA-II), and Bayesian optimization are compared. The approach is applied to an ensemble of three popular gradient-boosting techniques (extreme gradient boosting, light gradient-boosting machine, and categorical boosting) on features obtained from the Inception-ResNet-v2 CNN model applied to the benchmark BreakHis dataset, optimizing Precision, Recall, Accuracy, and AUC simultaneously. The novel NSGA2-IRv2-CXL model proposed in this study achieves a maximum Accuracy of 94.40%, AUC of 98.16, Precision of 95.77%, and Recall of 99.29% at 100× magnification. The study also establishes trade-offs between performance metrics, opening avenues for further research into multi-objective approaches to BCD that can provide a broader view of the strengths and weaknesses of a classification model. |
---|
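The summary's central idea, that no single hyperparameter setting is best on every metric at once, can be illustrated with a small Pareto-dominance filter, which is the comparison at the core of non-dominated sorting. This is only a minimal sketch and not the authors' implementation: the function names `dominates` and `pareto_front` and the metric values are hypothetical, chosen purely for illustration, and all metrics are assumed to be maximized.

```python
from typing import Dict, List

Metrics = Dict[str, float]


def dominates(a: Metrics, b: Metrics) -> bool:
    """Return True if `a` is at least as good as `b` on every metric
    and strictly better on at least one (all metrics are maximized)."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(candidates: List[Metrics]) -> List[Metrics]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]


# Hypothetical scores for three hyperparameter settings (NOT results from the paper).
candidates = [
    {"precision": 0.95, "recall": 0.90},  # favors precision
    {"precision": 0.90, "recall": 0.97},  # favors recall
    {"precision": 0.89, "recall": 0.89},  # worse than the first on both metrics
]

front = pareto_front(candidates)
for settings in front:
    print(settings)
```

In this toy example the first two settings remain on the front because each wins on one metric, while the third is dropped. Choosing between the surviving settings is the kind of trade-off the summary refers to, for instance preferring higher Recall when a missed malignant case is the costlier error.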