
Finishing monkeypox genomes from short reads: assembly analysis and a neural network method

BACKGROUND: Poxviruses constitute one of the largest and most complex animal virus families known. The notorious smallpox disease has been eradicated and the virus contained, but its simian sister, monkeypox, is an emerging, untreatable infectious disease, killing 1 to 10% of its human victims. In t...


Bibliographic Details
Main Authors: Zhao, Kun; Wohlhueter, Robert M.; Li, Yu
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009526/
https://www.ncbi.nlm.nih.gov/pubmed/27585810
http://dx.doi.org/10.1186/s12864-016-2826-8
author Zhao, Kun
Wohlhueter, Robert M.
Li, Yu
collection PubMed
description BACKGROUND: Poxviruses constitute one of the largest and most complex animal virus families known. The notorious smallpox disease has been eradicated and the virus contained, but its simian sister, monkeypox, is an emerging, untreatable infectious disease, killing 1 to 10% of its human victims. The emergence of monkeypox outbreaks in humans and the need to monitor potential malicious release of smallpox virus require the development of methods for rapid virus identification. Whole-genome sequencing (WGS) is an emergent technology with increasing application to the diagnosis of diseases and the identification of outbreak pathogens. But “finishing” such a genome is a laborious and time-consuming process, not easily automated. To date, the large, complete poxvirus genomes have not been studied comprehensively in terms of applying WGS techniques and evaluating genome assembly algorithms. RESULTS: To explore the limitations to finishing a poxvirus genome from short reads, we first analyze the repetitive regions in a monkeypox genome and evaluate genome assembly on simulated reads. We also report on procedures and insights relevant to the assembly of genomes from realistically short reads. Finally, we propose a neural network method (namely Neural-KSP) to “finish” the process by closing gaps remaining after conventional assembly, as the final stage in a protocol to elucidate clinical poxvirus genomic sequences. CONCLUSIONS: The protocol may prove useful for any clinical viral isolate (regardless of whether a reference-strain sequence is available) and especially useful for genomes confounded by many global and local repetitive sequences. This work highlights the feasibility of finishing real, complex genomes by systematically analyzing genetic characteristics and remedying existing assembly shortcomings with a neural network method. Such finished sequences may enable clinicians to track the genetic distance between viral isolates, which provides a powerful epidemiological tool.
format Online
Article
Text
id pubmed-5009526
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5009526 2016-09-08 Finishing monkeypox genomes from short reads: assembly analysis and a neural network method Zhao, Kun Wohlhueter, Robert M. Li, Yu BMC Genomics Research BioMed Central 2016-08-31 /pmc/articles/PMC5009526/ /pubmed/27585810 http://dx.doi.org/10.1186/s12864-016-2826-8 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Finishing monkeypox genomes from short reads: assembly analysis and a neural network method
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009526/
https://www.ncbi.nlm.nih.gov/pubmed/27585810
http://dx.doi.org/10.1186/s12864-016-2826-8