
A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data

BACKGROUND: Estimating the required sample size is crucial when developing and validating clinical prediction models. However, there is no consensus about how to determine the sample size in such a setting. Here, the goal was to compare available methods to define a practical solution to sample size...

Bibliographic Details
Main authors: Baeza-Delgado, Carlos; Cerdá Alberich, Leonor; Carot-Sierra, José Miguel; Veiga-Canuto, Diana; Martínez de las Heras, Blanca; Raza, Ben; Martí-Bonmatí, Luis
Format: Online Article Text
Language: English
Published: Springer Vienna 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9156610/
https://www.ncbi.nlm.nih.gov/pubmed/35641659
http://dx.doi.org/10.1186/s41747-022-00276-y
_version_ 1784718475955011584
author Baeza-Delgado, Carlos
Cerdá Alberich, Leonor
Carot-Sierra, José Miguel
Veiga-Canuto, Diana
Martínez de las Heras, Blanca
Raza, Ben
Martí-Bonmatí, Luis
author_facet Baeza-Delgado, Carlos
Cerdá Alberich, Leonor
Carot-Sierra, José Miguel
Veiga-Canuto, Diana
Martínez de las Heras, Blanca
Raza, Ben
Martí-Bonmatí, Luis
author_sort Baeza-Delgado, Carlos
collection PubMed
description BACKGROUND: Estimating the required sample size is crucial when developing and validating clinical prediction models. However, there is no consensus about how to determine the sample size in such a setting. Here, the goal was to compare available methods to define a practical solution to sample size estimation for clinical predictive models, as applied to Horizon 2020 PRIMAGE as a case study. METHODS: Three different methods (Riley’s; “rule of thumb” with 10 and 5 events per predictor) were employed to calculate the sample size required to develop predictive models and to analyse the variation in sample size as a function of different parameters. Subsequently, the sample size for model validation was also estimated. RESULTS: To develop reliable predictive models, 1397 neuroblastoma patients, 1060 high-risk neuroblastoma patients, and 1345 diffuse intrinsic pontine glioma (DIPG) patients are required. This sample size can be lowered by reducing the number of variables included in the model, by including direct measures of the outcome to be predicted and/or by increasing the follow-up period. For model validation, the estimated sample size was 326 patients for neuroblastoma, 246 for high-risk neuroblastoma, and 592 for DIPG. CONCLUSIONS: Given the variability of the different sample sizes obtained, we recommend using methods based on epidemiological data and the nature of the results, as the results are tailored to the specific clinical problem. In addition, sample size can be reduced by lowering the number of predictor parameters and by including direct measures of the outcome of interest.
format Online
Article
Text
id pubmed-9156610
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Vienna
record_format MEDLINE/PubMed
spelling pubmed-91566102022-06-02 A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data Baeza-Delgado, Carlos Cerdá Alberich, Leonor Carot-Sierra, José Miguel Veiga-Canuto, Diana Martínez de las Heras, Blanca Raza, Ben Martí-Bonmatí, Luis Eur Radiol Exp Original Article BACKGROUND: Estimating the required sample size is crucial when developing and validating clinical prediction models. However, there is no consensus about how to determine the sample size in such a setting. Here, the goal was to compare available methods to define a practical solution to sample size estimation for clinical predictive models, as applied to Horizon 2020 PRIMAGE as a case study. METHODS: Three different methods (Riley’s; “rule of thumb” with 10 and 5 events per predictor) were employed to calculate the sample size required to develop predictive models to analyse the variation in sample size as a function of different parameters. Subsequently, the sample size for model validation was also estimated. RESULTS: To develop reliable predictive models, 1397 neuroblastoma patients are required, 1060 high-risk neuroblastoma patients and 1345 diffuse intrinsic pontine glioma (DIPG) patients. This sample size can be lowered by reducing the number of variables included in the model, by including direct measures of the outcome to be predicted and/or by increasing the follow-up period. For model validation, the estimated sample size resulted to be 326 patients for neuroblastoma, 246 for high-risk neuroblastoma, and 592 for DIPG. CONCLUSIONS: Given the variability of the different sample sizes obtained, we recommend using methods based on epidemiological data and the nature of the results, as the results are tailored to the specific clinical problem. In addition, sample size can be reduced by lowering the number of parameter predictors, by including direct measures of the outcome of interest. 
Springer Vienna 2022-06-01 /pmc/articles/PMC9156610/ /pubmed/35641659 http://dx.doi.org/10.1186/s41747-022-00276-y Text en © The Author(s) under exclusive licence to European Society of Radiology 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
spellingShingle Original Article
Baeza-Delgado, Carlos
Cerdá Alberich, Leonor
Carot-Sierra, José Miguel
Veiga-Canuto, Diana
Martínez de las Heras, Blanca
Raza, Ben
Martí-Bonmatí, Luis
A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data
title A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9156610/
https://www.ncbi.nlm.nih.gov/pubmed/35641659
http://dx.doi.org/10.1186/s41747-022-00276-y