Sample Size Analysis for Machine Learning Clinical Validation Studies

Background: Before integrating new machine learning (ML) into clinical practice, algorithms must undergo validation. Validation studies require sample size estimates. Unlike hypothesis testing studies seeking a p-value, the goal of validating predictive models is obtaining estimates of model perform...

Bibliographic Details
Main Authors: Goldenholz, Daniel M.; Sun, Haoqi; Ganglberger, Wolfgang; Westover, M. Brandon
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10045793/
https://www.ncbi.nlm.nih.gov/pubmed/36979665
http://dx.doi.org/10.3390/biomedicines11030685
author Goldenholz, Daniel M.
Sun, Haoqi
Ganglberger, Wolfgang
Westover, M. Brandon
collection PubMed
description Background: Before integrating new machine learning (ML) into clinical practice, algorithms must undergo validation. Validation studies require sample size estimates. Unlike hypothesis testing studies seeking a p-value, the goal of validating predictive models is obtaining estimates of model performance. There is no standard tool for determining sample size estimates for clinical validation studies for machine learning models. Methods: Our open-source method, Sample Size Analysis for Machine Learning (SSAML), was described and tested in three previously published models: brain age to predict mortality (Cox Proportional Hazard), COVID hospitalization risk prediction (ordinal regression), and seizure risk forecasting (deep learning). Results: Minimum sample sizes were obtained in each dataset using standardized criteria. Discussion: SSAML provides a formal expectation of precision and accuracy at a desired confidence level. SSAML is open-source and agnostic to data type and ML model. It can be used for clinical validation studies of ML models.
format Online
Article
Text
id pubmed-10045793
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-100457932023-03-29 Sample Size Analysis for Machine Learning Clinical Validation Studies Goldenholz, Daniel M. Sun, Haoqi Ganglberger, Wolfgang Westover, M. Brandon Biomedicines Article Background: Before integrating new machine learning (ML) into clinical practice, algorithms must undergo validation. Validation studies require sample size estimates. Unlike hypothesis testing studies seeking a p-value, the goal of validating predictive models is obtaining estimates of model performance. There is no standard tool for determining sample size estimates for clinical validation studies for machine learning models. Methods: Our open-source method, Sample Size Analysis for Machine Learning (SSAML) was described and was tested in three previously published models: brain age to predict mortality (Cox Proportional Hazard), COVID hospitalization risk prediction (ordinal regression), and seizure risk forecasting (deep learning). Results: Minimum sample sizes were obtained in each dataset using standardized criteria. Discussion: SSAML provides a formal expectation of precision and accuracy at a desired confidence level. SSAML is open-source and agnostic to data type and ML model. It can be used for clinical validation studies of ML models. MDPI 2023-02-23 /pmc/articles/PMC10045793/ /pubmed/36979665 http://dx.doi.org/10.3390/biomedicines11030685 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Sample Size Analysis for Machine Learning Clinical Validation Studies
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10045793/
https://www.ncbi.nlm.nih.gov/pubmed/36979665
http://dx.doi.org/10.3390/biomedicines11030685