
An Unsupervised Algorithm for Host Identification in Flaviviruses

Early characterization of emerging viruses is essential to control their spread, such as the Zika Virus outbreak in 2014. Among other non-viral factors, host information is essential for the surveillance and control of virus spread. Flaviviruses (genus Flavivirus), akin to other viruses, are modulat...


Bibliographic Details
Main authors: Truong Nguyen, Phuoc, Garcia-Vallvé, Santiago, Puigbò, Pere
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8157105/
https://www.ncbi.nlm.nih.gov/pubmed/34069049
http://dx.doi.org/10.3390/life11050442
_version_ 1783699605892890624
author Truong Nguyen, Phuoc
Garcia-Vallvé, Santiago
Puigbò, Pere
collection PubMed
description Early characterization of emerging viruses is essential to control their spread, such as the Zika Virus outbreak in 2014. Among other non-viral factors, host information is essential for the surveillance and control of virus spread. Flaviviruses (genus Flavivirus), akin to other viruses, are modulated by high mutation rates and selective forces to adapt their codon usage to that of their hosts. However, a major challenge is the identification of potential hosts for novel viruses. Usually, potential hosts of emerging zoonotic viruses are identified after several confirmed cases. This is inefficient for deterring future outbreaks. In this paper, we introduce an algorithm to identify the host range of a virus from its raw genome sequences. The proposed strategy relies on comparing codon usage frequencies across viruses and hosts, by means of a normalized Codon Adaptation Index (CAI). We have tested our algorithm on 94 flaviviruses and 16 potential hosts. This novel method is able to distinguish between arthropod and vertebrate hosts for several flaviviruses with high values of accuracy (virus group 91.9% and host type 86.1%) and specificity (virus group 94.9% and host type 79.6%), in comparison to empirical observations. Overall, this algorithm may be useful as a complementary tool to current phylogenetic methods in monitoring current and future viral outbreaks by understanding host–virus relationships.
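The codon-usage comparison described in the abstract can be illustrated with a minimal sketch of the classic Codon Adaptation Index (the geometric mean of per-codon relative adaptiveness weights derived from a host reference set). This is not the authors' implementation: the abstract does not specify how the CAI is normalized, so the sketch computes the plain CAI only, and the function names, the 0.5 pseudocount for codons absent from the reference, and the `rank_hosts` helper are assumptions made for illustration.

```python
from collections import Counter
from math import exp, log

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(seq):
    """Split a coding sequence into complete codons (RNA 'U' read as 'T')."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_cds):
    """Weight of each codon = its count / count of the most used synonym.

    Assumption: codons never seen in the reference get a pseudocount of 0.5
    so that their weight is small but non-zero.
    """
    counts = Counter(c for c in codons(reference_cds) if c in CODON_TABLE)
    weights = {}
    for codon, aa in CODON_TABLE.items():
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        if aa == "*" or len(family) == 1:
            continue  # stop codons, Met and Trp carry no usage information
        best = max(counts[c] or 0.5 for c in family)
        weights[codon] = (counts[codon] or 0.5) / best
    return weights


def cai(query_cds, weights):
    """Geometric mean of the weights of the informative codons in the query."""
    logs = [log(weights[c]) for c in codons(query_cds) if c in weights]
    return exp(sum(logs) / len(logs)) if logs else float("nan")


def rank_hosts(virus_cds, host_references):
    """Hypothetical helper: sort candidate hosts by CAI of the viral sequence."""
    scores = {
        host: cai(virus_cds, relative_adaptiveness(ref))
        for host, ref in host_references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In this sketch a higher CAI means the viral codon usage is closer to that of the host reference set. Turning such scores into a host call would need the normalization and decision rule from the full article, which the abstract does not describe.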
format Online
Article
Text
id pubmed-8157105
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8157105 2021-05-28 An Unsupervised Algorithm for Host Identification in Flaviviruses Truong Nguyen, Phuoc Garcia-Vallvé, Santiago Puigbò, Pere Life (Basel) Communication Early characterization of emerging viruses is essential to control their spread, such as the Zika Virus outbreak in 2014. Among other non-viral factors, host information is essential for the surveillance and control of virus spread. Flaviviruses (genus Flavivirus), akin to other viruses, are modulated by high mutation rates and selective forces to adapt their codon usage to that of their hosts. However, a major challenge is the identification of potential hosts for novel viruses. Usually, potential hosts of emerging zoonotic viruses are identified after several confirmed cases. This is inefficient for deterring future outbreaks. In this paper, we introduce an algorithm to identify the host range of a virus from its raw genome sequences. The proposed strategy relies on comparing codon usage frequencies across viruses and hosts, by means of a normalized Codon Adaptation Index (CAI). We have tested our algorithm on 94 flaviviruses and 16 potential hosts. This novel method is able to distinguish between arthropod and vertebrate hosts for several flaviviruses with high values of accuracy (virus group 91.9% and host type 86.1%) and specificity (virus group 94.9% and host type 79.6%), in comparison to empirical observations. Overall, this algorithm may be useful as a complementary tool to current phylogenetic methods in monitoring current and future viral outbreaks by understanding host–virus relationships. MDPI 2021-05-14 /pmc/articles/PMC8157105/ /pubmed/34069049 http://dx.doi.org/10.3390/life11050442 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Communication
Truong Nguyen, Phuoc
Garcia-Vallvé, Santiago
Puigbò, Pere
An Unsupervised Algorithm for Host Identification in Flaviviruses
title An Unsupervised Algorithm for Host Identification in Flaviviruses
topic Communication
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8157105/
https://www.ncbi.nlm.nih.gov/pubmed/34069049
http://dx.doi.org/10.3390/life11050442