
CoV-Seq, a New Tool for SARS-CoV-2 Genome Analysis and Visualization: Development and Usability Study

BACKGROUND: COVID-19 became a global pandemic not long after its identification in late 2019. The genomes of SARS-CoV-2 are being rapidly sequenced and shared on public repositories. To keep up with these updates, scientists need to frequently refresh and reclean data sets, which is an ad hoc and la...


Bibliographic Details
Main Authors: Liu, Boxiang; Liu, Kaibo; Zhang, He; Zhang, Liang; Bian, Yuchen; Huang, Liang
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7537720/
https://www.ncbi.nlm.nih.gov/pubmed/32931441
http://dx.doi.org/10.2196/22299
author Liu, Boxiang
Liu, Kaibo
Zhang, He
Zhang, Liang
Bian, Yuchen
Huang, Liang
collection PubMed
description BACKGROUND: COVID-19 became a global pandemic not long after its identification in late 2019. The genomes of SARS-CoV-2 are being rapidly sequenced and shared on public repositories. To keep up with these updates, scientists need to frequently refresh and reclean data sets, which is an ad hoc and labor-intensive process. Further, scientists with limited bioinformatics or programming knowledge may find it difficult to analyze SARS-CoV-2 genomes. OBJECTIVE: To address these challenges, we developed CoV-Seq, an integrated web server that enables simple and rapid analysis of SARS-CoV-2 genomes. METHODS: CoV-Seq is implemented in Python and JavaScript. The web server and source code URLs are provided in this article. RESULTS: Given a new sequence, CoV-Seq automatically predicts gene boundaries and identifies genetic variants, which are displayed in an interactive genome visualizer and are downloadable for further analysis. A command-line interface is available for high-throughput processing. In addition, we aggregated all publicly available SARS-CoV-2 sequences from the Global Initiative on Sharing Avian Influenza Data (GISAID), National Center for Biotechnology Information (NCBI), European Nucleotide Archive (ENA), and China National GeneBank (CNGB), and extracted genetic variants from these sequences for download and downstream analysis. The CoV-Seq database is updated weekly. CONCLUSIONS: We have developed CoV-Seq, an integrated web service for fast and easy analysis of custom SARS-CoV-2 sequences. The web server provides an interactive module for the analysis of custom sequences and a weekly updated database of genetic variants of all publicly accessible SARS-CoV-2 sequences. We believe CoV-Seq will help improve our understanding of the genetic underpinnings of COVID-19.
format Online
Article
Text
id pubmed-7537720
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7537720 2020-10-20 J Med Internet Res Original Paper JMIR Publications 2020-10-02 /pmc/articles/PMC7537720/ /pubmed/32931441 http://dx.doi.org/10.2196/22299 Text en ©Boxiang Liu, Kaibo Liu, He Zhang, Liang Zhang, Yuchen Bian, Liang Huang. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 02.10.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title CoV-Seq, a New Tool for SARS-CoV-2 Genome Analysis and Visualization: Development and Usability Study
topic Original Paper