
Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models


Bibliographic Details
Main Authors: Liu, Dianbo; Clemente, Leonardo; Poirier, Canelle; Ding, Xiyu; Chinazzi, Matteo; Davis, Jessica; Vespignani, Alessandro; Santillana, Mauricio
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7459435/
https://www.ncbi.nlm.nih.gov/pubmed/32730217
http://dx.doi.org/10.2196/20285
author Liu, Dianbo
Clemente, Leonardo
Poirier, Canelle
Ding, Xiyu
Chinazzi, Matteo
Davis, Jessica
Vespignani, Alessandro
Santillana, Mauricio
author_sort Liu, Dianbo
collection PubMed
description BACKGROUND: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; and if left unchecked, they may become major public health threats to the planet. The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events. OBJECTIVE: We present a timely and novel methodology that combines disease estimates from mechanistic models and digital traces, via interpretable machine learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time. METHODS: Our method uses the following as inputs: (a) official health reports, (b) COVID-19–related internet search activity, (c) news media activity, and (d) daily forecasts of COVID-19 activity from a metapopulation mechanistic model. Our machine learning methodology uses a clustering technique that enables the exploitation of geospatial synchronicities of COVID-19 activity across Chinese provinces and a data augmentation technique to deal with the small number of historical disease observations characteristic of emerging outbreaks. RESULTS: Our model is able to produce stable and accurate forecasts 2 days ahead of the current time and outperforms a collection of baseline models in 27 out of 32 Chinese provinces. CONCLUSIONS: Our methodology could be easily extended to other geographies currently affected by COVID-19 to aid decision makers with monitoring and possibly prevention.
format Online
Article
Text
id pubmed-7459435
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7459435 2020-09-03 J Med Internet Res Original Paper
JMIR Publications 2020-08-17 /pmc/articles/PMC7459435/ /pubmed/32730217 http://dx.doi.org/10.2196/20285 Text en ©Dianbo Liu, Leonardo Clemente, Canelle Poirier, Xiyu Ding, Matteo Chinazzi, Jessica Davis, Alessandro Vespignani, Mauricio Santillana. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 17.08.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models
topic Original Paper