The importance of making testable predictions: A cautionary tale

We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring, and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla California. Eve...

Bibliographic Details
Main authors: Choi, Emma S., Saberski, Erik, Lorimer, Tom, Smith, Cameron, Kandage-don, Unduwap, Burton, Ronald S., Sugihara, George
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7723288/
https://www.ncbi.nlm.nih.gov/pubmed/33290401
http://dx.doi.org/10.1371/journal.pone.0236541
_version_ 1783620314061602816
author Choi, Emma S.
Saberski, Erik
Lorimer, Tom
Smith, Cameron
Kandage-don, Unduwap
Burton, Ronald S.
Sugihara, George
collection PubMed
description We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring, and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla California. Even more surprising was that this event-based result persisted despite the large and variable number of fish species involved (up to 46), and the large and variable time interval between trigger and response (up to ~3 months). To mitigate potential over-fitting, we made an out-of-sample prediction beyond the publication process for the peak summer egg abundance observed at Scripps Pier in 2020 (available on bioRxiv). During peer-review, the prediction failed, and while it would be tempting to explain this away as a result of the record-breaking toxic algal bloom that occurred during the spring (9x higher concentration of dinoflagellates than ever previously recorded), a re-examination of our methodology revealed a potential source of over-fitting that had not been evaluated for robustness. This cautionary tale highlights the importance of testable true out-of-sample predictions of future values that cannot (even accidentally) be used in model fitting, and that can therefore catch model assumptions that may otherwise escape notice. We believe that this example can benefit the current push towards ecology as a predictive science and support the notion that predictions should live and die in the public domain, along with the models that made them.
format Online
Article
Text
id pubmed-7723288
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7723288 2020-12-16 The importance of making testable predictions: A cautionary tale Choi, Emma S. Saberski, Erik Lorimer, Tom Smith, Cameron Kandage-don, Unduwap Burton, Ronald S. Sugihara, George PLoS One Research Article We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring, and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla California. Even more surprising was that this event-based result persisted despite the large and variable number of fish species involved (up to 46), and the large and variable time interval between trigger and response (up to ~3 months). To mitigate potential over-fitting, we made an out-of-sample prediction beyond the publication process for the peak summer egg abundance observed at Scripps Pier in 2020 (available on bioRxiv). During peer-review, the prediction failed, and while it would be tempting to explain this away as a result of the record-breaking toxic algal bloom that occurred during the spring (9x higher concentration of dinoflagellates than ever previously recorded), a re-examination of our methodology revealed a potential source of over-fitting that had not been evaluated for robustness. This cautionary tale highlights the importance of testable true out-of-sample predictions of future values that cannot (even accidentally) be used in model fitting, and that can therefore catch model assumptions that may otherwise escape notice. We believe that this example can benefit the current push towards ecology as a predictive science and support the notion that predictions should live and die in the public domain, along with the models that made them.
Public Library of Science 2020-12-08 /pmc/articles/PMC7723288/ /pubmed/33290401 http://dx.doi.org/10.1371/journal.pone.0236541 Text en © 2020 Choi et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title The importance of making testable predictions: A cautionary tale
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7723288/
https://www.ncbi.nlm.nih.gov/pubmed/33290401
http://dx.doi.org/10.1371/journal.pone.0236541