Host Subtraction, Filtering and Assembly Validations for Novel Viral Discovery Using Next Generation Sequencing Data
The use of next generation sequencing (NGS) to identify novel viral sequences from eukaryotic tissue samples is challenging. Issues can include the low proportion and copy number of viral reads and the high number of contigs (post-assembly), making subsequent viral analysis difficult. Comparison of...
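Main authors: | Daly, Gordon M.; Leggett, Richard M.; Rowe, William; Stubbs, Samuel; Wilkinson, Maxim; Ramirez-Gonzalez, Ricardo H.; Caccamo, Mario; Bernal, William; Heeney, Jonathan L. |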
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4476701/ https://www.ncbi.nlm.nih.gov/pubmed/26098299 http://dx.doi.org/10.1371/journal.pone.0129059 |
_version_ | 1782377636867080192 |
---|---|
author | Daly, Gordon M. Leggett, Richard M. Rowe, William Stubbs, Samuel Wilkinson, Maxim Ramirez-Gonzalez, Ricardo H. Caccamo, Mario Bernal, William Heeney, Jonathan L. |
author_sort | Daly, Gordon M. |
collection | PubMed |
description | The use of next generation sequencing (NGS) to identify novel viral sequences from eukaryotic tissue samples is challenging. Issues can include the low proportion and copy number of viral reads and the high number of contigs (post-assembly), making subsequent viral analysis difficult. Comparison of assembly algorithms with pre-assembly host-mapping subtraction using a short-read mapping tool, a k-mer frequency based filter and a low complexity filter, has been validated for viral discovery with Illumina data derived from naturally infected liver tissue and simulated data. Assembled contig numbers were significantly reduced (up to 99.97%) by the application of these pre-assembly filtering methods. This approach provides a validated method for maximizing viral contig size as well as reducing the total number of assembled contigs that require down-stream analysis as putative viral nucleic acids. |
format | Online Article Text |
id | pubmed-4476701 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-44767012015-06-25 Host Subtraction, Filtering and Assembly Validations for Novel Viral Discovery Using Next Generation Sequencing Data Daly, Gordon M. Leggett, Richard M. Rowe, William Stubbs, Samuel Wilkinson, Maxim Ramirez-Gonzalez, Ricardo H. Caccamo, Mario Bernal, William Heeney, Jonathan L. PLoS One Research Article The use of next generation sequencing (NGS) to identify novel viral sequences from eukaryotic tissue samples is challenging. Issues can include the low proportion and copy number of viral reads and the high number of contigs (post-assembly), making subsequent viral analysis difficult. Comparison of assembly algorithms with pre-assembly host-mapping subtraction using a short-read mapping tool, a k-mer frequency based filter and a low complexity filter, has been validated for viral discovery with Illumina data derived from naturally infected liver tissue and simulated data. Assembled contig numbers were significantly reduced (up to 99.97%) by the application of these pre-assembly filtering methods. This approach provides a validated method for maximizing viral contig size as well as reducing the total number of assembled contigs that require down-stream analysis as putative viral nucleic acids. Public Library of Science 2015-06-22 /pmc/articles/PMC4476701/ /pubmed/26098299 http://dx.doi.org/10.1371/journal.pone.0129059 Text en © 2015 Daly et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Daly, Gordon M. Leggett, Richard M. Rowe, William Stubbs, Samuel Wilkinson, Maxim Ramirez-Gonzalez, Ricardo H. Caccamo, Mario Bernal, William Heeney, Jonathan L. Host Subtraction, Filtering and Assembly Validations for Novel Viral Discovery Using Next Generation Sequencing Data |
title | Host Subtraction, Filtering and Assembly Validations for Novel Viral Discovery Using Next Generation Sequencing Data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4476701/ https://www.ncbi.nlm.nih.gov/pubmed/26098299 http://dx.doi.org/10.1371/journal.pone.0129059 |