PREHOST: Host prediction of coronaviridae family using machine learning


Bibliographic Details
Main authors: Chaturvedi, Anusha; Borkar, Kushal; Priyakumar, U Deva; Vinod, P.K.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9922161/
https://www.ncbi.nlm.nih.gov/pubmed/36816252
http://dx.doi.org/10.1016/j.heliyon.2023.e13646
Description
Summary: Coronavirus, a zoonotic virus capable of transmitting infections from animals to humans, recently emerged as a pandemic. In such circumstances, it is essential to understand the virus's origin. In this study, we present PreHost, a novel machine-learning pipeline for host prediction of the family Coronaviridae. We leverage the complete viral genome and sequences at the protein level (spike protein, membrane protein, and nucleocapsid protein). Compared with current state-of-the-art approaches, the random forest model attained high accuracy and recall scores of 99.91% and 0.98, respectively, for genome sequences. In addition to spike protein sequences, our study shows that membrane and nucleocapsid protein sequences can be used to predict the host of viruses. We also identified important sites in the viral sequences that help distinguish between different host classes. The PreHost host prediction pipeline will serve as a valuable tool for taking effective measures to control the transmission of future viruses.