Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection
High-throughput sequencing (HTS) has demonstrated capabilities for broad virus detection based upon discovery of known and novel viruses in a variety of samples, including clinical, environmental, and biological. An important goal for HTS applications in biologics is to establish parameter settings...
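Main authors: | Lambert, Christophe; Braxton, Cassandra; Charlebois, Robert L.; Deyati, Avisek; Duncan, Paul; La Neve, Fabio; Malicki, Heather D.; Ribrioux, Sebastien; Rozelle, Daniel K.; Michaels, Brandye; Sun, Wenping; Yang, Zhihui; Khan, Arifa S. |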
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6213042/ https://www.ncbi.nlm.nih.gov/pubmed/30262776 http://dx.doi.org/10.3390/v10100528 |
_version_ | 1783367680663748608 |
---|---|
author | Lambert, Christophe; Braxton, Cassandra; Charlebois, Robert L.; Deyati, Avisek; Duncan, Paul; La Neve, Fabio; Malicki, Heather D.; Ribrioux, Sebastien; Rozelle, Daniel K.; Michaels, Brandye; Sun, Wenping; Yang, Zhihui; Khan, Arifa S. |
author_facet | Lambert, Christophe; Braxton, Cassandra; Charlebois, Robert L.; Deyati, Avisek; Duncan, Paul; La Neve, Fabio; Malicki, Heather D.; Ribrioux, Sebastien; Rozelle, Daniel K.; Michaels, Brandye; Sun, Wenping; Yang, Zhihui; Khan, Arifa S. |
author_sort | Lambert, Christophe |
collection | PubMed |
description | High-throughput sequencing (HTS) has demonstrated capabilities for broad virus detection based upon discovery of known and novel viruses in a variety of samples, including clinical, environmental, and biological. An important goal for HTS applications in biologics is to establish parameter settings that can afford adequate sensitivity at an acceptable computational cost (computation time, computer memory, storage, expense, and/or efficiency) at critical steps in the bioinformatics pipeline, including initial data quality assessment, trimming/cleaning, and assembly (to reduce data volume and increase the likelihood of appropriate sequence identification). Additionally, the quality and reliability of the results depend on the availability of a complete and curated viral database; selection and configuration of sequence alignment programs that retain specificity for broad virus detection with reduced false-positive signals; removal of host sequences without loss of endogenous viral sequences of interest; and use of a meaningful reporting format that retains critical information from the analysis for presentation of readily interpretable data and actionable results. Furthermore, after alignment, both automated and manual evaluation may be needed to verify the results and help assign a potential risk level to residual, unmapped reads. We hope that the collective considerations discussed in this paper aid in the optimization of data analysis pipelines for virus detection by HTS. |
format | Online Article Text |
id | pubmed-6213042 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6213042 2018-11-09 Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection. Lambert, Christophe; Braxton, Cassandra; Charlebois, Robert L.; Deyati, Avisek; Duncan, Paul; La Neve, Fabio; Malicki, Heather D.; Ribrioux, Sebastien; Rozelle, Daniel K.; Michaels, Brandye; Sun, Wenping; Yang, Zhihui; Khan, Arifa S. Viruses. Perspective. High-throughput sequencing (HTS) has demonstrated capabilities for broad virus detection based upon discovery of known and novel viruses in a variety of samples, including clinical, environmental, and biological. An important goal for HTS applications in biologics is to establish parameter settings that can afford adequate sensitivity at an acceptable computational cost (computation time, computer memory, storage, expense, and/or efficiency) at critical steps in the bioinformatics pipeline, including initial data quality assessment, trimming/cleaning, and assembly (to reduce data volume and increase the likelihood of appropriate sequence identification). Additionally, the quality and reliability of the results depend on the availability of a complete and curated viral database; selection and configuration of sequence alignment programs that retain specificity for broad virus detection with reduced false-positive signals; removal of host sequences without loss of endogenous viral sequences of interest; and use of a meaningful reporting format that retains critical information from the analysis for presentation of readily interpretable data and actionable results. Furthermore, after alignment, both automated and manual evaluation may be needed to verify the results and help assign a potential risk level to residual, unmapped reads. We hope that the collective considerations discussed in this paper aid in the optimization of data analysis pipelines for virus detection by HTS.
MDPI 2018-09-27 /pmc/articles/PMC6213042/ /pubmed/30262776 http://dx.doi.org/10.3390/v10100528 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Perspective Lambert, Christophe Braxton, Cassandra Charlebois, Robert L. Deyati, Avisek Duncan, Paul La Neve, Fabio Malicki, Heather D. Ribrioux, Sebastien Rozelle, Daniel K. Michaels, Brandye Sun, Wenping Yang, Zhihui Khan, Arifa S. Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection |
title | Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection |
title_full | Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection |
title_fullStr | Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection |
title_full_unstemmed | Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection |
title_short | Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection |
title_sort | considerations for optimization of high-throughput sequencing bioinformatics pipelines for virus detection |
topic | Perspective |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6213042/ https://www.ncbi.nlm.nih.gov/pubmed/30262776 http://dx.doi.org/10.3390/v10100528 |