A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data
BACKGROUND: Estimating the required sample size is crucial when developing and validating clinical prediction models. However, there is no consensus about how to determine the sample size in such a setting. Here, the goal was to compare available methods to define a practical solution to sample size...
Main authors: | Baeza-Delgado, Carlos; Cerdá Alberich, Leonor; Carot-Sierra, José Miguel; Veiga-Canuto, Diana; Martínez de las Heras, Blanca; Raza, Ben; Martí-Bonmatí, Luis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Vienna 2022 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9156610/ https://www.ncbi.nlm.nih.gov/pubmed/35641659 http://dx.doi.org/10.1186/s41747-022-00276-y |
_version_ | 1784718475955011584 |
---|---|
author | Baeza-Delgado, Carlos Cerdá Alberich, Leonor Carot-Sierra, José Miguel Veiga-Canuto, Diana Martínez de las Heras, Blanca Raza, Ben Martí-Bonmatí, Luis |
author_facet | Baeza-Delgado, Carlos Cerdá Alberich, Leonor Carot-Sierra, José Miguel Veiga-Canuto, Diana Martínez de las Heras, Blanca Raza, Ben Martí-Bonmatí, Luis |
author_sort | Baeza-Delgado, Carlos |
collection | PubMed |
description | BACKGROUND: Estimating the required sample size is crucial when developing and validating clinical prediction models. However, there is no consensus on how to determine the sample size in such a setting. Here, the goal was to compare available methods in order to define a practical solution to sample size estimation for clinical predictive models, as applied to Horizon 2020 PRIMAGE as a case study. METHODS: Three different methods (Riley’s; “rule of thumb” with 10 and 5 events per predictor) were employed to calculate the sample size required to develop predictive models and to analyse the variation in sample size as a function of different parameters. Subsequently, the sample size for model validation was also estimated. RESULTS: To develop reliable predictive models, 1397 neuroblastoma patients, 1060 high-risk neuroblastoma patients and 1345 diffuse intrinsic pontine glioma (DIPG) patients are required. This sample size can be lowered by reducing the number of variables included in the model, by including direct measures of the outcome to be predicted and/or by increasing the follow-up period. For model validation, the estimated sample size was 326 patients for neuroblastoma, 246 for high-risk neuroblastoma, and 592 for DIPG. CONCLUSIONS: Given the variability of the different sample sizes obtained, we recommend using methods based on epidemiological data and the nature of the results, as the results are tailored to the specific clinical problem. In addition, sample size can be reduced by lowering the number of predictor parameters and by including direct measures of the outcome of interest. |
format | Online Article Text |
id | pubmed-9156610 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Vienna |
record_format | MEDLINE/PubMed |
spelling | pubmed-91566102022-06-02 A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data Baeza-Delgado, Carlos Cerdá Alberich, Leonor Carot-Sierra, José Miguel Veiga-Canuto, Diana Martínez de las Heras, Blanca Raza, Ben Martí-Bonmatí, Luis Eur Radiol Exp Original Article BACKGROUND: Estimating the required sample size is crucial when developing and validating clinical prediction models. However, there is no consensus about how to determine the sample size in such a setting. Here, the goal was to compare available methods to define a practical solution to sample size estimation for clinical predictive models, as applied to Horizon 2020 PRIMAGE as a case study. METHODS: Three different methods (Riley’s; “rule of thumb” with 10 and 5 events per predictor) were employed to calculate the sample size required to develop predictive models to analyse the variation in sample size as a function of different parameters. Subsequently, the sample size for model validation was also estimated. RESULTS: To develop reliable predictive models, 1397 neuroblastoma patients are required, 1060 high-risk neuroblastoma patients and 1345 diffuse intrinsic pontine glioma (DIPG) patients. This sample size can be lowered by reducing the number of variables included in the model, by including direct measures of the outcome to be predicted and/or by increasing the follow-up period. For model validation, the estimated sample size resulted to be 326 patients for neuroblastoma, 246 for high-risk neuroblastoma, and 592 for DIPG. CONCLUSIONS: Given the variability of the different sample sizes obtained, we recommend using methods based on epidemiological data and the nature of the results, as the results are tailored to the specific clinical problem. In addition, sample size can be reduced by lowering the number of parameter predictors, by including direct measures of the outcome of interest. 
Springer Vienna 2022-06-01 /pmc/articles/PMC9156610/ /pubmed/35641659 http://dx.doi.org/10.1186/s41747-022-00276-y Text en © The Author(s) under exclusive licence to European Society of Radiology 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Original Article Baeza-Delgado, Carlos Cerdá Alberich, Leonor Carot-Sierra, José Miguel Veiga-Canuto, Diana Martínez de las Heras, Blanca Raza, Ben Martí-Bonmatí, Luis A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
title | A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
title_full | A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
title_fullStr | A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
title_full_unstemmed | A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
title_short | A practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
title_sort | practical solution to estimate the sample size required for clinical prediction models generated from observational research on data |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9156610/ https://www.ncbi.nlm.nih.gov/pubmed/35641659 http://dx.doi.org/10.1186/s41747-022-00276-y |
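
The description field mentions a “rule of thumb” with 10 or 5 events per predictor. A minimal sketch of how such a rule is commonly turned into a sample size is given below. It is not the authors' calculation and it is not Riley's method; the function name, the number of predictors and the event fraction are hypothetical values chosen only for illustration.

```python
import math


def events_per_predictor_sample_size(n_predictors, events_per_predictor, event_fraction):
    """Illustrative rule-of-thumb sample size (hypothetical helper).

    n_predictors         -- number of candidate predictor parameters in the model
    events_per_predictor -- required events per predictor (e.g. 10 or 5)
    event_fraction       -- assumed proportion of patients who experience the event
    """
    if n_predictors <= 0 or events_per_predictor <= 0:
        raise ValueError("n_predictors and events_per_predictor must be positive")
    if not 0 < event_fraction <= 1:
        raise ValueError("event_fraction must be in (0, 1]")

    # Total number of events the rule asks for.
    required_events = n_predictors * events_per_predictor

    # Patients needed so that the expected number of events reaches that total.
    return math.ceil(required_events / event_fraction)


if __name__ == "__main__":
    # Hypothetical inputs: 10 predictors, 25% of patients experience the event.
    for epp in (10, 5):
        n = events_per_predictor_sample_size(10, epp, 0.25)
        print(f"{epp} events per predictor -> {n} patients")
```

With these assumed inputs, halving the events-per-predictor threshold halves the required sample, and reducing the number of predictors lowers it proportionally, which is in line with the record's remark that fewer model variables reduce the sample size.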