Rapidly Identifying New Coronavirus Mutations of Potential Concern in the Omicron Variant Using an Unsupervised Learning Strategy

Extensive mutations in the Omicron spike protein appear to accelerate the transmission of SARS-CoV-2, and rapid infections increase the odds that additional mutants will emerge. To build an investigative framework, we have applied an unsupervised machine learning approach to 4296 Omicron viral genom...

Bibliographic Details
Main authors: Zhao, Lue Ping; Lybrand, Terry; Gilbert, Peter; Payne, Thomas H.; Pyo, Chul-Woo; Geraghty, Daniel; Jerome, Keith
Format: Online Article Text
Language: English
Published: American Journal Experts 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8887078/
https://www.ncbi.nlm.nih.gov/pubmed/35233566
http://dx.doi.org/10.21203/rs.3.rs-1280819/v1
author Zhao, Lue Ping
Lybrand, Terry
Gilbert, Peter
Payne, Thomas H.
Pyo, Chul-Woo
Geraghty, Daniel
Jerome, Keith
collection PubMed
description Extensive mutations in the Omicron spike protein appear to accelerate the transmission of SARS-CoV-2, and rapid infections increase the odds that additional mutants will emerge. To build an investigative framework, we have applied an unsupervised machine learning approach to 4296 Omicron viral genomes collected and deposited to GISAID as of December 14, 2021, and have identified a core haplotype of 28 polymutants (A67V, T95I, G339D, R346K, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, K796Y, N856K, Q954H, N69K, L981F) in the spike protein and a separate core haplotype of 17 polymutants in non-spike genes: (K38, A1892) in nsp3, T492 in nsp4, (P132, V247, T280, S284) in 3C-like proteinase, I189 in nsp6, P323 in RNA-dependent RNA polymerase, I42 in exonuclease, T9 in envelope protein, (D3, Q19, A63) in membrane glycoprotein, and (P13, R203, G204) in nucleocapsid phosphoprotein. Using these core haplotypes as reference, we have identified four newly emerging polymutants (R346, A701, I1081, N1192) in the spike protein (p-values = 9.37×10^−4, 1.0×10^−15, 4.76×10^−7 and 1.56×10^−4, respectively), and five additional polymutants in non-spike genes (D343G in nucleocapsid phosphoprotein, V1069I in nsp3, V94A in nsp4, F694Y in the RNA-dependent RNA polymerase and L106L/F in ORF3a) that exhibit significantly increasing trajectories (all p-values < 1.0×10^−15). In the absence of relevant clinical data for these newly emerging mutations, it is important to monitor them closely. Two emerging mutations may be of particular concern: the N1192S mutation in the spike protein lies in an extremely conserved region of all human coronaviruses that is integral to the viral fusion process, and the F694Y mutation in the RNA polymerase may induce conformational changes that could affect remdesivir binding.
format Online
Article
Text
id pubmed-8887078
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-8887078 2022-03-02 Res Sq Article American Journal Experts 2022-02-25 /pmc/articles/PMC8887078/ /pubmed/35233566 http://dx.doi.org/10.21203/rs.3.rs-1280819/v1 Text en This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Rapidly Identifying New Coronavirus Mutations of Potential Concern in the Omicron Variant Using an Unsupervised Learning Strategy
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8887078/
https://www.ncbi.nlm.nih.gov/pubmed/35233566
http://dx.doi.org/10.21203/rs.3.rs-1280819/v1