An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology
Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to...
Main authors: | Lai, Mallory; Wulff, Shaun S.; Cao, Yongtao; Robinson, Timothy J.; Rajapaksha, Rasika |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10562867/ https://www.ncbi.nlm.nih.gov/pubmed/37822674 http://dx.doi.org/10.1016/j.mex.2023.102382 |
_version_ | 1785118225394040832 |
---|---|
author | Lai, Mallory Wulff, Shaun S. Cao, Yongtao Robinson, Timothy J. Rajapaksha, Rasika |
author_facet | Lai, Mallory Wulff, Shaun S. Cao, Yongtao Robinson, Timothy J. Rajapaksha, Rasika |
author_sort | Lai, Mallory |
collection | PubMed |
description | Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to create an interpretable, hypothesis-driven framework for machine learning that can handle different nowcast and forecast lengths. Some of the techniques employed include: • Feature engineering to construct interpretable features, like site-specific lead times, hypothesized to be potential predictors of COVID-19 cases. • Feature selection to identify features with the best predictive performance for the tasks of nowcasting and forecasting. • Prequential evaluation to prevent data leakage while evaluating the performance of the machine learning algorithm. |
format | Online Article Text |
id | pubmed-10562867 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10562867 2023-10-11 An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology Lai, Mallory Wulff, Shaun S. Cao, Yongtao Robinson, Timothy J. Rajapaksha, Rasika MethodsX Bioinformatics Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to create an interpretable, hypothesis-driven framework for machine learning that can handle different nowcast and forecast lengths. Some of the techniques employed include: • Feature engineering to construct interpretable features, like site-specific lead times, hypothesized to be potential predictors of COVID-19 cases. • Feature selection to identify features with the best predictive performance for the tasks of nowcasting and forecasting. • Prequential evaluation to prevent data leakage while evaluating the performance of the machine learning algorithm. Elsevier 2023-09-27 /pmc/articles/PMC10562867/ /pubmed/37822674 http://dx.doi.org/10.1016/j.mex.2023.102382 Text en © 2023 Published by Elsevier B.V. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Bioinformatics Lai, Mallory Wulff, Shaun S. Cao, Yongtao Robinson, Timothy J. Rajapaksha, Rasika An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
title | An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
title_full | An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
title_fullStr | An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
title_full_unstemmed | An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
title_short | An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
title_sort | interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10562867/ https://www.ncbi.nlm.nih.gov/pubmed/37822674 http://dx.doi.org/10.1016/j.mex.2023.102382 |
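
The description above names lead-time features and prequential evaluation as ways to forecast without data leakage. The following is a minimal sketch of that idea on synthetic data, not the authors' implementation: the function name `prequential_mae`, the ordinary least-squares model, and the simulated series are all assumptions made for illustration. At each forecast origin `t`, the model is fit only on pairs whose target was observed strictly before `t`, then tested on the next unseen target.

```python
import numpy as np


def prequential_mae(x, y, lead, min_train=10):
    """Test-then-train evaluation (hypothetical helper, not from the paper).

    At origin t we predict y[t + lead] from x[t], fitting only on pairs
    (x[i], y[i + lead]) whose target index i + lead is strictly before t.
    """
    errors = []
    for t in range(min_train + lead, len(x) - lead):
        n = t - lead  # number of pairs whose target is already observed
        design = np.column_stack([np.ones(n), x[:n]])
        coef, *_ = np.linalg.lstsq(design, y[lead:lead + n], rcond=None)
        pred = coef[0] + coef[1] * x[t]
        errors.append(abs(pred - y[t + lead]))
    return float(np.mean(errors))


# Synthetic stand-in data (assumption): cases trail the signal by 3 steps.
rng = np.random.default_rng(0)
signal = rng.normal(size=60)
cases = np.zeros(60)
cases[3:] = 2.0 * signal[:-3]
cases += rng.normal(scale=0.1, size=60)

# Score candidate lead times and keep the one with the lowest error.
scores = {lead: prequential_mae(signal, cases, lead) for lead in range(6)}
best_lead = min(scores, key=scores.get)
print(best_lead, scores[best_lead])
```

Choosing the lead time by prequential error, rather than by in-sample fit, keeps the selection step itself free of future information. With real data each site would get its own candidate leads and the regression would be replaced by whichever learner is under study.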