
VAPiD: a lightweight cross-platform viral annotation pipeline and identification tool to facilitate virus genome submissions to NCBI GenBank

BACKGROUND: With sequencing technologies becoming cheaper and easier to use, more groups are able to obtain whole genome sequences of viruses of public health and scientific importance. Submission of genomic data to NCBI GenBank is a requirement prior to publication and plays a critical role in maki...


Bibliographic Details
Main authors: Shean, Ryan C., Makhsous, Negar, Stoddard, Graham D., Lin, Michelle J., Greninger, Alexander L.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343335/
https://www.ncbi.nlm.nih.gov/pubmed/30674273
http://dx.doi.org/10.1186/s12859-019-2606-y
_version_ 1783389266364071936
author Shean, Ryan C.
Makhsous, Negar
Stoddard, Graham D.
Lin, Michelle J.
Greninger, Alexander L.
author_facet Shean, Ryan C.
Makhsous, Negar
Stoddard, Graham D.
Lin, Michelle J.
Greninger, Alexander L.
author_sort Shean, Ryan C.
collection PubMed
description BACKGROUND: With sequencing technologies becoming cheaper and easier to use, more groups are able to obtain whole genome sequences of viruses of public health and scientific importance. Submission of genomic data to NCBI GenBank is a requirement prior to publication and plays a critical role in making scientific data publicly available. GenBank currently has automatic prokaryotic and eukaryotic genome annotation pipelines but has no viral annotation pipeline beyond influenza virus. Annotation and submission of viral genome sequence is a non-trivial task, especially for groups that do not routinely interact with GenBank for data submissions. RESULTS: We present Viral Annotation Pipeline and iDentification (VAPiD), a portable and lightweight command-line tool for annotation and GenBank deposition of viral genomes. VAPiD supports annotation of nearly all unsegmented viral genomes. The pipeline has been validated on human immunodeficiency virus, human parainfluenza virus 1–4, human metapneumovirus, human coronaviruses (229E/OC43/NL63/HKU1/SARS/MERS), human enteroviruses/rhinoviruses, measles virus, mumps virus, Hepatitis A-E Virus, Chikungunya virus, dengue virus, and West Nile virus, as well as the human polyomaviruses BK/JC/MCV, human adenoviruses, and human papillomaviruses. The program can handle individual or batch submissions of different viruses to GenBank and correctly annotates multiple viruses, including those that contain ribosomal slippage or RNA editing, without prior knowledge of the virus to be annotated. VAPiD is programmed in Python and is compatible with Windows, Linux, and Mac OS systems. CONCLUSIONS: We have created a portable, lightweight, user-friendly, internet-enabled, open-source, command-line genome annotation and submission package to facilitate virus genome submissions to NCBI GenBank. Instructions for downloading and installing VAPiD can be found at https://github.com/rcs333/VAPiD.
format Online
Article
Text
id pubmed-6343335
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6343335 2019-01-24 VAPiD: a lightweight cross-platform viral annotation pipeline and identification tool to facilitate virus genome submissions to NCBI GenBank Shean, Ryan C. Makhsous, Negar Stoddard, Graham D. Lin, Michelle J. Greninger, Alexander L. BMC Bioinformatics Software BACKGROUND: With sequencing technologies becoming cheaper and easier to use, more groups are able to obtain whole genome sequences of viruses of public health and scientific importance. Submission of genomic data to NCBI GenBank is a requirement prior to publication and plays a critical role in making scientific data publicly available. GenBank currently has automatic prokaryotic and eukaryotic genome annotation pipelines but has no viral annotation pipeline beyond influenza virus. Annotation and submission of viral genome sequence is a non-trivial task, especially for groups that do not routinely interact with GenBank for data submissions. RESULTS: We present Viral Annotation Pipeline and iDentification (VAPiD), a portable and lightweight command-line tool for annotation and GenBank deposition of viral genomes. VAPiD supports annotation of nearly all unsegmented viral genomes. The pipeline has been validated on human immunodeficiency virus, human parainfluenza virus 1–4, human metapneumovirus, human coronaviruses (229E/OC43/NL63/HKU1/SARS/MERS), human enteroviruses/rhinoviruses, measles virus, mumps virus, Hepatitis A-E Virus, Chikungunya virus, dengue virus, and West Nile virus, as well as the human polyomaviruses BK/JC/MCV, human adenoviruses, and human papillomaviruses. The program can handle individual or batch submissions of different viruses to GenBank and correctly annotates multiple viruses, including those that contain ribosomal slippage or RNA editing, without prior knowledge of the virus to be annotated. VAPiD is programmed in Python and is compatible with Windows, Linux, and Mac OS systems.
CONCLUSIONS: We have created a portable, lightweight, user-friendly, internet-enabled, open-source, command-line genome annotation and submission package to facilitate virus genome submissions to NCBI GenBank. Instructions for downloading and installing VAPiD can be found at https://github.com/rcs333/VAPiD. BioMed Central 2019-01-23 /pmc/articles/PMC6343335/ /pubmed/30674273 http://dx.doi.org/10.1186/s12859-019-2606-y Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Shean, Ryan C.
Makhsous, Negar
Stoddard, Graham D.
Lin, Michelle J.
Greninger, Alexander L.
VAPiD: a lightweight cross-platform viral annotation pipeline and identification tool to facilitate virus genome submissions to NCBI GenBank
title VAPiD: a lightweight cross-platform viral annotation pipeline and identification tool to facilitate virus genome submissions to NCBI GenBank
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343335/
https://www.ncbi.nlm.nih.gov/pubmed/30674273
http://dx.doi.org/10.1186/s12859-019-2606-y