Prediction of pandemic risk for animal-origin coronavirus using a deep learning method

BACKGROUND: Coronaviruses can be isolated from bats, civets, pangolins, birds and other wild animals. As animal-origin pathogens, coronaviruses can cross the species barrier and cause pandemics in humans. In this study, a deep learning model for early prediction of pandemic risk was proposed based on the...

Bibliographic Details
Main Authors: Kou, Zheng, Huang, Yi-Fan, Shen, Ao, Kosari, Saeed, Liu, Xiang-Rong, Qiang, Xiao-Li
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8542360/
https://www.ncbi.nlm.nih.gov/pubmed/34689829
http://dx.doi.org/10.1186/s40249-021-00912-6
author Kou, Zheng
Huang, Yi-Fan
Shen, Ao
Kosari, Saeed
Liu, Xiang-Rong
Qiang, Xiao-Li
collection PubMed
description BACKGROUND: Coronaviruses can be isolated from bats, civets, pangolins, birds and other wild animals. As animal-origin pathogens, coronaviruses can cross the species barrier and cause pandemics in humans. In this study, a deep learning model for early prediction of pandemic risk was proposed based on the sequences of viral genomes.
METHODS: A total of 3257 genomes were downloaded from the Coronavirus Genome Resource Library. We present a deep learning model of cross-species coronavirus infection that combines a bidirectional gated recurrent unit network with a one-dimensional convolution. The genome sequences of animal-origin coronaviruses were input directly to extract features and predict pandemic risk. The best performance was sought by using pre-trained DNA vectors and an attention mechanism. The area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPR) were used to evaluate the predictive models.
RESULTS: The six specific models performed well for their corresponding virus groups (1 for AUROC and 1 for AUPR). The general model with pre-trained vectors and an attention mechanism provided excellent predictions for all virus groups (1 for AUROC and 1 for AUPR), while the models without pre-trained vectors or the attention mechanism showed a clear reduction in performance (about 5–25%). Re-training experiments showed that the general model has good transfer learning capability (average for six groups: 0.968 for AUROC and 0.942 for AUPR) and should give reasonable predictions for the potential pathogen of the next pandemic. The artificial negative data, in which the coding region of the spike protein was replaced, were also predicted correctly (100% accuracy). An easy-to-use tool implementing the predictor was created in the Python programming language.
CONCLUSIONS: A robust deep learning model with pre-trained vectors and an attention mechanism learned the features of the whole genomes of animal-origin coronaviruses and could predict the risk of cross-species infection for early warning of the next pandemic.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40249-021-00912-6.
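The abstract reports model quality as AUROC and AUPR. As a rough illustration of what those two numbers measure, the sketch below computes them in plain Python for a list of binary labels and predicted scores. It is not the authors' code: the function names `auroc` and `average_precision` are hypothetical, and average precision is only one common estimator of AUPR, so the record's figures may have been computed differently.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUPR estimated as average precision: the mean of the precision values
    observed at each positive example when ranking by descending score.
    Tied scores are kept in input order rather than grouped (a simplification)."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


if __name__ == "__main__":
    # Toy data invented for illustration (1 = positive class, 0 = negative class).
    y_true = [1, 1, 0, 1, 0, 0]
    y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    print(f"AUROC = {auroc(y_true, y_score):.3f}")
    print(f"AUPR  = {average_precision(y_true, y_score):.3f}")
```

Under these definitions, a value of 1 for both metrics means every positive example is scored above every negative one.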
format Online
Article
Text
id pubmed-8542360
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8542360 2021-10-25 Prediction of pandemic risk for animal-origin coronavirus using a deep learning method Kou, Zheng Huang, Yi-Fan Shen, Ao Kosari, Saeed Liu, Xiang-Rong Qiang, Xiao-Li Infect Dis Poverty Research Article BioMed Central 2021-10-24 /pmc/articles/PMC8542360/ /pubmed/34689829 http://dx.doi.org/10.1186/s40249-021-00912-6 Text en
© The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Prediction of pandemic risk for animal-origin coronavirus using a deep learning method
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8542360/
https://www.ncbi.nlm.nih.gov/pubmed/34689829
http://dx.doi.org/10.1186/s40249-021-00912-6