
PREHOST: Host prediction of coronaviridae family using machine learning

Coronavirus, a zoonotic virus capable of transmitting infections from animals to humans, emerged as a pandemic recently. In such circumstances, it is essential to understand the virus's origin. In this study, we present a novel machine-learning pipeline PreHost for host prediction of the family...


Bibliographic Details
Main authors: Chaturvedi, Anusha; Borkar, Kushal; Priyakumar, U Deva; Vinod, P.K.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9922161/
https://www.ncbi.nlm.nih.gov/pubmed/36816252
http://dx.doi.org/10.1016/j.heliyon.2023.e13646
collection PubMed
description Coronavirus, a zoonotic virus capable of transmitting infections from animals to humans, recently emerged as a pandemic. In such circumstances, it is essential to understand the virus's origin. In this study, we present a novel machine-learning pipeline, PreHost, for host prediction of the family Coronaviridae. We leverage the complete viral genome and sequences at the protein level (spike protein, membrane protein, and nucleocapsid protein). Compared with the current state-of-the-art approaches, the random forest model attained high accuracy and recall scores of 99.91% and 0.98, respectively, for genome sequences. In addition to the spike protein sequences, our study shows that membrane and nucleocapsid protein sequences can be used to predict the host of viruses. We also identified important sites in the viral sequences that help distinguish between different host classes. The host prediction pipeline PreHost will serve as a valuable tool for taking effective measures to control the transmission of future viruses.
id pubmed-9922161
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9922161 2023-02-13 PREHOST: Host prediction of coronaviridae family using machine learning. Chaturvedi, Anusha; Borkar, Kushal; Priyakumar, U Deva; Vinod, P.K. Heliyon. Research Article. Elsevier 2023-02-11. /pmc/articles/PMC9922161/ /pubmed/36816252 http://dx.doi.org/10.1016/j.heliyon.2023.e13646 Text en. © 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).