Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds
Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a sma...
Main author: | Stumpf, Michael P. H. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society 2020 |
Subjects: | Life Sciences–Mathematics interface |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7653378/ https://www.ncbi.nlm.nih.gov/pubmed/33081645 http://dx.doi.org/10.1098/rsif.2020.0419 |
_version_ | 1783607893377941504 |
---|---|
author | Stumpf, Michael P. H. |
author_facet | Stumpf, Michael P. H. |
author_sort | Stumpf, Michael P. H. |
collection | PubMed |
description | Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a small number—typically less than 10—of models are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics, but as I show here are not guaranteed to improve on individual estimators or predictors. I will show in the cases of model selection and network inference when we can trust ensembles, and when we should be cautious. The analyses suggest that the careful construction of an ensemble—choosing good predictors—is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook which allows carrying out an assessment of ensemble estimators is provided. |
format | Online Article Text |
id | pubmed-7653378 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | The Royal Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-7653378 2020-11-17 Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds Stumpf, Michael P. H. J R Soc Interface Life Sciences–Mathematics interface Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a small number—typically less than 10—of models are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics, but as I show here are not guaranteed to improve on individual estimators or predictors. I will show in the cases of model selection and network inference when we can trust ensembles, and when we should be cautious. The analyses suggest that the careful construction of an ensemble—choosing good predictors—is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook which allows carrying out an assessment of ensemble estimators is provided. The Royal Society 2020-10 2020-10-21 /pmc/articles/PMC7653378/ /pubmed/33081645 http://dx.doi.org/10.1098/rsif.2020.0419 Text en © 2020 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited. |
spellingShingle | Life Sciences–Mathematics interface Stumpf, Michael P. H. Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
title | Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
title_full | Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
title_fullStr | Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
title_full_unstemmed | Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
title_short | Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
title_sort | multi-model and network inference based on ensemble estimates: avoiding the madness of crowds |
topic | Life Sciences–Mathematics interface |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7653378/ https://www.ncbi.nlm.nih.gov/pubmed/33081645 http://dx.doi.org/10.1098/rsif.2020.0419 |
work_keys_str_mv | AT stumpfmichaelph multimodelandnetworkinferencebasedonensembleestimatesavoidingthemadnessofcrowds |