Misclassified: identification of zoonotic transition biomarker candidates for influenza A viruses using deep neural network

Introduction: Zoonotic transition of Influenza A viruses is the cause of epidemics with high rates of morbidity and mortality. Predicting which viral strains are likely to transition from their genetic sequence could help in the prevention and response against these zoonotic strains. We hypothesized...

Bibliographic Details
Main authors: Hatibi, Nissrine, Dumont-Lagacé, Maude, Alouani, Zakaria, El Fatimy, Rachid, Abik, Mounia, Daouda, Tariq
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10415530/
https://www.ncbi.nlm.nih.gov/pubmed/37576548
http://dx.doi.org/10.3389/fgene.2023.1145166
author Hatibi, Nissrine
Dumont-Lagacé, Maude
Alouani, Zakaria
El Fatimy, Rachid
Abik, Mounia
Daouda, Tariq
collection PubMed
description Introduction: Zoonotic transition of Influenza A viruses is the cause of epidemics with high rates of morbidity and mortality. Predicting which viral strains are likely to transition from their genetic sequence could help in the prevention and response against these zoonotic strains. We hypothesized that features predictive of viral hosts could be leveraged to identify biomarkers of zoonotic viral transition. Methods: We trained deep learning models to predict viral hosts based on the virus mRNA or protein sequences. Our multi-host dataset contained 848,630 unique nucleotide sequences obtained from the NCBI Influenza Virus and Influenza Research Databases. Each sequence, representing one gene from one viral strain, was classified into one of the three host categories: Avian, Human, and Swine. Trained models were analyzed using various neural network interpretation methods to identify interesting candidates for zoonotic transition biomarkers. Results: Using mRNA sequences as input led to higher prediction accuracies than amino acids, suggesting that the codon sequence contains information relevant to viral hosts that is lost during protein translation. UMAP visualization of the latent space of our classifiers showed that viral sequences clustered according to their host of origin. Interestingly, sequences from pandemic zoonotic viral strains localized at the margins between hosts, while zoonotic sequences incapable of Human-to-Human transmission localized with non-zoonotic viruses from the same host. In addition, host prediction for pandemic zoonotic sequences had low prediction accuracy, which was not the case for the other zoonotic strains. This supports our hypothesis that ambiguously predicted viral sequences bear features associated with cross-species infectivity. Finally, we compared misclassified sequences to well-classified ones to extract interesting candidates for zoonotic transition biomarkers. 
While features varied significantly between pairs of species and viral genes, several codons were conserved in Swine-to-Human and Avian-to-Human misclassified sequences, and in particular in the NA, HA, and NP genes, suggesting their importance for zoonosis in Humans. Discussion: Analysis of viral sequences using neural network interpretation approaches revealed important genetic differences between zoonotic viruses with pandemic potential, compared to non-zoonotic viral strains or zoonotic viruses incapable of Human-to-Human transmission.
format Online
Article
Text
id pubmed-10415530
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10415530 2023-08-12 Front Genet Genetics Frontiers Media S.A. 2023-07-27 /pmc/articles/PMC10415530/ /pubmed/37576548 http://dx.doi.org/10.3389/fgene.2023.1145166 Text en Copyright © 2023 Hatibi, Dumont-Lagacé, Alouani, El Fatimy, Abik and Daouda. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Misclassified: identification of zoonotic transition biomarker candidates for influenza A viruses using deep neural network
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10415530/
https://www.ncbi.nlm.nih.gov/pubmed/37576548
http://dx.doi.org/10.3389/fgene.2023.1145166