A machine learning model based on ultrasound image features to assess the risk of sentinel lymph node metastasis in breast cancer patients: Applications of scikit-learn and SHAP

BACKGROUND: This study aimed to determine an optimal machine learning (ML) model for evaluating the preoperative diagnostic value of ultrasound signs of breast cancer lesions for sentinel lymph node (SLN) status. METHOD: This study retrospectively analyzed the ultrasound images and postoperative pat...

Bibliographic Details
Main Authors: Zhang, Gaosen; Shi, Yan; Yin, Peipei; Liu, Feifei; Fang, Yi; Li, Xiang; Zhang, Qingyu; Zhang, Zhen
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Oncology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9359803/
https://www.ncbi.nlm.nih.gov/pubmed/35957890
http://dx.doi.org/10.3389/fonc.2022.944569
author Zhang, Gaosen
Shi, Yan
Yin, Peipei
Liu, Feifei
Fang, Yi
Li, Xiang
Zhang, Qingyu
Zhang, Zhen
collection PubMed
description BACKGROUND: This study aimed to identify the optimal machine learning (ML) model for evaluating the preoperative diagnostic value of ultrasound signs of breast cancer lesions for sentinel lymph node (SLN) status. METHODS: We retrospectively analyzed the ultrasound images and postoperative pathological findings of lesions in 952 breast cancer patients. First, univariate analysis was used to examine the relationship between the ultrasonographic morphological features of breast cancer lesions and SLN metastasis. Then, based on the ultrasound signs of the lesions, we screened ten ML models: support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), linear discriminant analysis (LDA), logistic regression (LR), naive Bayes (NB), k-nearest neighbors (KNN), multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN). Diagnostic performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), Kappa value, accuracy, F1-score, sensitivity, and specificity. We then constructed a clinical prediction model based on the ML algorithm with the best diagnostic performance. Finally, we used SHapley Additive exPlanations (SHAP) to visualize and analyze the model's diagnostic process. RESULTS: Of the 952 patients with breast cancer, 394 (41.4%) had SLN metastasis and 558 (58.6%) did not. Univariate analysis found that the shape, orientation, margin, posterior features, calcifications, architectural distortion, duct changes, and suspicious lymph nodes of breast cancer lesions on ultrasound were associated with SLN metastasis. Among the ten ML algorithms, XGBoost had the best overall diagnostic performance for SLN metastasis, with an average AUC of 0.952, an average Kappa of 0.763, and an average accuracy of 0.891. In the validation cohort, the XGBoost model achieved an AUC of 0.916, an accuracy of 0.846, a sensitivity of 0.870, a specificity of 0.862, and an F1-score of 0.826. The diagnostic performance of the XGBoost model was significantly higher than that of experienced radiologists in some cases (P<0.001). SHAP visualization of the model showed that ultrasonic detection of suspicious lymph nodes, microcalcifications in the primary tumor, spiculated (burr-like) margins of the primary tumor, and distortion of the tissue structure around the lesion contributed greatly to the diagnostic performance of the XGBoost model. CONCLUSIONS: The XGBoost model based on the ultrasound signs of the primary breast tumor and its surrounding tissues and lymph nodes has high diagnostic performance for predicting SLN metastasis. Visual explanation using SHAP makes it an effective tool for guiding preoperative clinical decision-making.
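The evaluation metrics named in the description (AUC, Kappa, accuracy, F1-score, sensitivity, specificity) can be illustrated with a minimal sketch. This is not the authors' code: the title indicates they used scikit-learn, whereas the sketch below uses plain Python so that each formula is visible. The function names `binary_metrics` and `roc_auc` are hypothetical, and the confusion-matrix counts and scores are made-up illustration values, not data from the study.

```python
from typing import Dict, Sequence


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Threshold-based metrics from the four cells of a 2x2 confusion matrix."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / n
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement (accuracy) corrected for the
    # agreement expected by chance from the row and column marginals.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((tn + fn) / n) * ((tn + fp) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "kappa": kappa,
    }


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the study).
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
    # Hypothetical predicted probabilities for four patients.
    print(f"auc: {roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]):.3f}")
```

Sensitivity, specificity, accuracy, F1-score, and Kappa all depend on a chosen decision threshold, while AUC summarizes ranking quality across all thresholds, which is one reason a study may report both kinds of metric side by side.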
format Online
Article
Text
id pubmed-9359803
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9359803 2022-08-10 Front Oncol Oncology Frontiers Media S.A. 2022-07-25 /pmc/articles/PMC9359803/ /pubmed/35957890 http://dx.doi.org/10.3389/fonc.2022.944569 Text en Copyright © 2022 Zhang, Shi, Yin, Liu, Fang, Li, Zhang and Zhang https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A machine learning model based on ultrasound image features to assess the risk of sentinel lymph node metastasis in breast cancer patients: Applications of scikit-learn and SHAP
topic Oncology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9359803/
https://www.ncbi.nlm.nih.gov/pubmed/35957890
http://dx.doi.org/10.3389/fonc.2022.944569