Sample Size Analysis for Machine Learning Clinical Validation Studies
Background: Before integrating new machine learning (ML) into clinical practice, algorithms must undergo validation. Validation studies require sample size estimates. Unlike hypothesis testing studies seeking a p-value, the goal of validating predictive models is obtaining estimates of model perform...
Main Authors: | Goldenholz, Daniel M.; Sun, Haoqi; Ganglberger, Wolfgang; Westover, M. Brandon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10045793/ https://www.ncbi.nlm.nih.gov/pubmed/36979665 http://dx.doi.org/10.3390/biomedicines11030685 |
_version_ | 1784913691612807168 |
---|---|
author | Goldenholz, Daniel M.; Sun, Haoqi; Ganglberger, Wolfgang; Westover, M. Brandon |
author_sort | Goldenholz, Daniel M. |
collection | PubMed |
description | Background: Before integrating new machine learning (ML) into clinical practice, algorithms must undergo validation. Validation studies require sample size estimates. Unlike hypothesis testing studies seeking a p-value, the goal of validating predictive models is obtaining estimates of model performance. There is no standard tool for determining sample size estimates for clinical validation studies for machine learning models. Methods: Our open-source method, Sample Size Analysis for Machine Learning (SSAML) was described and was tested in three previously published models: brain age to predict mortality (Cox Proportional Hazard), COVID hospitalization risk prediction (ordinal regression), and seizure risk forecasting (deep learning). Results: Minimum sample sizes were obtained in each dataset using standardized criteria. Discussion: SSAML provides a formal expectation of precision and accuracy at a desired confidence level. SSAML is open-source and agnostic to data type and ML model. It can be used for clinical validation studies of ML models. |
format | Online Article Text |
id | pubmed-10045793 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10045793 2023-03-29 Sample Size Analysis for Machine Learning Clinical Validation Studies. Goldenholz, Daniel M.; Sun, Haoqi; Ganglberger, Wolfgang; Westover, M. Brandon. Biomedicines. Article. MDPI 2023-02-23 /pmc/articles/PMC10045793/ /pubmed/36979665 http://dx.doi.org/10.3390/biomedicines11030685 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Sample Size Analysis for Machine Learning Clinical Validation Studies |
topic | Article |