
Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds

Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a sma...


Bibliographic Details
Main author: Stumpf, Michael P. H.
Format: Online Article Text
Language: English
Published: The Royal Society 2020
Subjects: Life Sciences–Mathematics interface
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7653378/
https://www.ncbi.nlm.nih.gov/pubmed/33081645
http://dx.doi.org/10.1098/rsif.2020.0419
author Stumpf, Michael P. H.
collection PubMed
description Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a small number—typically less than 10—of models are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics, but as I show here are not guaranteed to improve on individual estimators or predictors. I will show in the cases of model selection and network inference when we can trust ensembles, and when we should be cautious. The analyses suggest that the careful construction of an ensemble—choosing good predictors—is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook which allows carrying out an assessment of ensemble estimators is provided.
format Online
Article
Text
id pubmed-7653378
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher The Royal Society
record_format MEDLINE/PubMed
spelling pubmed-7653378 2020-11-17 Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds Stumpf, Michael P. H. J R Soc Interface Life Sciences–Mathematics interface The Royal Society 2020-10 2020-10-21 /pmc/articles/PMC7653378/ /pubmed/33081645 http://dx.doi.org/10.1098/rsif.2020.0419 Text en © 2020 The Authors.
http://creativecommons.org/licenses/by/4.0/ Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
title Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds
topic Life Sciences–Mathematics interface