Power and reproducibility in the external validation of brain-phenotype predictions
Identifying reproducible and generalizable brain-phenotype associations is a central goal of neuroimaging. Consistent with this goal, prediction frameworks evaluate brain-phenotype models in unseen data. Most prediction studies train and evaluate a model in the same dataset. However, external validation…
Main Authors: | Rosenblatt, Matthew; Tejavibulya, Link; Camp, Chris C.; Jiang, Rongtao; Westwater, Margaret L.; Noble, Stephanie; Scheinost, Dustin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634903/ https://www.ncbi.nlm.nih.gov/pubmed/37961654 http://dx.doi.org/10.1101/2023.10.25.563971 |
_version_ | 1785146258695913472 |
---|---|
author | Rosenblatt, Matthew; Tejavibulya, Link; Camp, Chris C.; Jiang, Rongtao; Westwater, Margaret L.; Noble, Stephanie; Scheinost, Dustin |
author_facet | Rosenblatt, Matthew; Tejavibulya, Link; Camp, Chris C.; Jiang, Rongtao; Westwater, Margaret L.; Noble, Stephanie; Scheinost, Dustin |
author_sort | Rosenblatt, Matthew |
collection | PubMed |
description | Identifying reproducible and generalizable brain-phenotype associations is a central goal of neuroimaging. Consistent with this goal, prediction frameworks evaluate brain-phenotype models in unseen data. Most prediction studies train and evaluate a model in the same dataset. However, external validation, or the evaluation of a model in an external dataset, provides a better assessment of robustness and generalizability. Despite the promise of external validation and calls for its usage, the statistical power of such studies has yet to be investigated. In this work, we ran over 60 million simulations across several datasets, phenotypes, and sample sizes to better understand how the sizes of the training and external datasets affect statistical power. We found that prior external validation studies used sample sizes prone to low power, which may lead to false negatives and effect size inflation. Furthermore, increases in the external sample size led to increased simulated power directly following theoretical power curves, whereas changes in the training dataset size offset the simulated power curves. Finally, we compared the performance of a model within a dataset to the external performance. The within-dataset performance was typically within r=0.2 of the cross-dataset performance, which could help decide how to power future external validation studies. Overall, our results illustrate the importance of considering the sample sizes of both the training and external datasets when performing external validation. |
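The description above summarizes a simulation-based power analysis: repeatedly sampling data at a given effect size and sample size, and comparing the fraction of significant results to a theoretical power curve. The sketch below illustrates that general idea for a Pearson correlation effect (e.g., between predicted and observed phenotype scores) — it is a minimal illustration under a bivariate-normal data model, not the authors' actual pipeline, and all function names and parameter values are assumptions.

```python
import numpy as np
from scipy import stats

def simulated_power(rho, n, n_sims=2000, alpha=0.05, rng=None):
    """Empirical power: fraction of simulated samples of size n in which
    a true correlation rho is detected at significance level alpha."""
    rng = np.random.default_rng(rng)
    cov = [[1.0, rho], [rho, 1.0]]
    hits = 0
    for _ in range(n_sims):
        x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
        _, p = stats.pearsonr(x, y)
        hits += p < alpha
    return hits / n_sims

def theoretical_power(rho, n, alpha=0.05):
    """Approximate two-sided power via the Fisher z-transform:
    atanh(r) is approximately normal with standard error 1/sqrt(n-3)."""
    z = np.arctanh(rho) * np.sqrt(n - 3)
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.sf(z_crit - z) + stats.norm.cdf(-z_crit - z)

# Compare simulated and theoretical power across candidate sample sizes
for n in (50, 100, 200, 400):
    print(f"n={n:4d}  simulated={simulated_power(0.2, n, rng=0):.3f}  "
          f"theoretical={theoretical_power(0.2, n):.3f}")
```

As in the study's framing, the simulated curve tracks the theoretical one when only the evaluation (external) sample size varies; the paper's additional finding is that varying the training-set size shifts the achievable effect size itself, which a one-dataset sketch like this cannot capture.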
format | Online Article Text |
id | pubmed-10634903 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10634903 2023-11-13 Power and reproducibility in the external validation of brain-phenotype predictions Rosenblatt, Matthew Tejavibulya, Link Camp, Chris C. Jiang, Rongtao Westwater, Margaret L. Noble, Stephanie Scheinost, Dustin bioRxiv Article Identifying reproducible and generalizable brain-phenotype associations is a central goal of neuroimaging. Consistent with this goal, prediction frameworks evaluate brain-phenotype models in unseen data. Most prediction studies train and evaluate a model in the same dataset. However, external validation, or the evaluation of a model in an external dataset, provides a better assessment of robustness and generalizability. Despite the promise of external validation and calls for its usage, the statistical power of such studies has yet to be investigated. In this work, we ran over 60 million simulations across several datasets, phenotypes, and sample sizes to better understand how the sizes of the training and external datasets affect statistical power. We found that prior external validation studies used sample sizes prone to low power, which may lead to false negatives and effect size inflation. Furthermore, increases in the external sample size led to increased simulated power directly following theoretical power curves, whereas changes in the training dataset size offset the simulated power curves. Finally, we compared the performance of a model within a dataset to the external performance. The within-dataset performance was typically within r=0.2 of the cross-dataset performance, which could help decide how to power future external validation studies. Overall, our results illustrate the importance of considering the sample sizes of both the training and external datasets when performing external validation.
Cold Spring Harbor Laboratory 2023-10-30 /pmc/articles/PMC10634903/ /pubmed/37961654 http://dx.doi.org/10.1101/2023.10.25.563971 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Rosenblatt, Matthew Tejavibulya, Link Camp, Chris C. Jiang, Rongtao Westwater, Margaret L. Noble, Stephanie Scheinost, Dustin Power and reproducibility in the external validation of brain-phenotype predictions |
title | Power and reproducibility in the external validation of brain-phenotype predictions |
title_full | Power and reproducibility in the external validation of brain-phenotype predictions |
title_fullStr | Power and reproducibility in the external validation of brain-phenotype predictions |
title_full_unstemmed | Power and reproducibility in the external validation of brain-phenotype predictions |
title_short | Power and reproducibility in the external validation of brain-phenotype predictions |
title_sort | power and reproducibility in the external validation of brain-phenotype predictions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634903/ https://www.ncbi.nlm.nih.gov/pubmed/37961654 http://dx.doi.org/10.1101/2023.10.25.563971 |
work_keys_str_mv | AT rosenblattmatthew powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions AT tejavibulyalink powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions AT campchrisc powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions AT jiangrongtao powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions AT westwatermargaretl powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions AT noblestephanie powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions AT scheinostdustin powerandreproducibilityintheexternalvalidationofbrainphenotypepredictions |