Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression
We analyze the whole genome phylogeny and taxonomy of the SARS-CoV-2 virus using compression. This is a new fast alignment-free method called the “normalized compression distance” (NCD) method. It discovers all effective similarities based on Kolmogorov complexity. The latter being incomputable we a...
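The normalized compression distance named in the abstract is commonly defined as NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed length of a string and xy is the concatenation of x and y. The following is a minimal sketch of that formula, not the authors' implementation. It assumes Python's standard `lzma` module as a stand-in for zpaq, and the helpers `random_dna` and `mutate` are hypothetical, used only to build toy sequences.

```python
import lzma
import random


def compressed_size(data: bytes) -> int:
    """Compressed length in bytes; LZMA is an assumed stand-in for zpaq."""
    return len(lzma.compress(data, preset=9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation xy
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(n: int, seed: int) -> bytes:
    """Hypothetical helper: a reproducible random sequence over A, C, G, T."""
    rng = random.Random(seed)
    return bytes(rng.choice(b"ACGT") for _ in range(n))


def mutate(seq: bytes, rate: float, seed: int) -> bytes:
    """Hypothetical helper: redraw each base at random with probability `rate`."""
    rng = random.Random(seed)
    return bytes(rng.choice(b"ACGT") if rng.random() < rate else base for base in seq)


if __name__ == "__main__":
    genome = random_dna(4000, seed=1)
    close_relative = mutate(genome, rate=0.01, seed=2)
    unrelated = random_dna(4000, seed=3)
    print("close relative:", ncd(genome, close_relative))
    print("unrelated:     ", ncd(genome, unrelated))
```

On these toy inputs the lightly mutated copy should score well below the unrelated sequence. Real genomes would be read from sequence files, and the choice of compressor affects the values obtained.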
Main authors: | Cilibrasi, Rudi L.; Vitányi, Paul M.B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8132223/ https://www.ncbi.nlm.nih.gov/pubmed/34013267 http://dx.doi.org/10.1101/2020.07.22.216242 |
_version_ | 1783694874018578432 |
---|---|
author | Cilibrasi, Rudi L. Vitányi, Paul M.B. |
author_facet | Cilibrasi, Rudi L. Vitányi, Paul M.B. |
author_sort | Cilibrasi, Rudi L. |
collection | PubMed |
description | We analyze the whole-genome phylogeny and taxonomy of the SARS-CoV-2 virus using compression. This is a new, fast, alignment-free method called the “normalized compression distance” (NCD) method. It discovers all effective similarities based on Kolmogorov complexity. Because Kolmogorov complexity is incomputable, we approximate it with a good compressor such as the modern zpaq. The results show that the SARS-CoV-2 virus is closest to the RaTG13 virus and similar to two bat SARS-like coronaviruses, bat-SL-CoVZXC21 and bat-SL-CoVZC4. The similarity is quantified and compared with the same quantified similarities among the mtDNA of certain species. We address the question of whether pangolins are involved in the SARS-CoV-2 virus. The compression method is simpler and possibly faster than any other whole-genome method, which makes it the ideal tool to explore phylogeny. |
format | Online Article Text |
id | pubmed-8132223 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-8132223 2021-05-20 Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression Cilibrasi, Rudi L. Vitányi, Paul M.B. bioRxiv Article We analyze the whole genome phylogeny and taxonomy of the SARS-CoV-2 virus using compression. This is a new fast alignment-free method called the “normalized compression distance” (NCD) method. It discovers all effective similarities based on Kolmogorov complexity. The latter being incomputable we approximate it by a good compressor such as the modern zpaq. The results comprise that the SARS-CoV-2 virus is closest to the RaTG13 virus and similar to two bat SARS-like coronaviruses bat-SL-CoVZXC21 and bat-SL-CoVZC4. The similarity is quantified and compared with the same quantified similarities among the mtDNA of certain species. We treat the question whether Pangolins are involved in the SARS-CoV-2 virus. The compression method is simpler and possibly faster than any other whole genome method, which makes it the ideal tool to explore phylogeny. Cold Spring Harbor Laboratory 2020-08-26 /pmc/articles/PMC8132223/ /pubmed/34013267 http://dx.doi.org/10.1101/2020.07.22.216242 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Cilibrasi, Rudi L. Vitányi, Paul M.B. Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression |
title | Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression |
title_full | Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression |
title_fullStr | Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression |
title_full_unstemmed | Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression |
title_short | Fast Whole-Genome Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression |
title_sort | fast whole-genome phylogeny of the covid-19 virus sars-cov-2 by compression |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8132223/ https://www.ncbi.nlm.nih.gov/pubmed/34013267 http://dx.doi.org/10.1101/2020.07.22.216242 |
work_keys_str_mv | AT cilibrasirudil fastwholegenomephylogenyofthecovid19virussarscov2bycompression AT vitanyipaulmb fastwholegenomephylogenyofthecovid19virussarscov2bycompression |