On the effective depth of viral sequence data

Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which data accurately describe the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity...

Bibliographic Details
Main Authors: Illingworth, Christopher J R, Roy, Sunando, Beale, Mathew A, Tutill, Helena, Williams, Rachel, Breuer, Judith
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5724399/
https://www.ncbi.nlm.nih.gov/pubmed/29250429
http://dx.doi.org/10.1093/ve/vex030
_version_ 1783285349566382080
author Illingworth, Christopher J R
Roy, Sunando
Beale, Mathew A
Tutill, Helena
Williams, Rachel
Breuer, Judith
author_facet Illingworth, Christopher J R
Roy, Sunando
Beale, Mathew A
Tutill, Helena
Williams, Rachel
Breuer, Judith
author_sort Illingworth, Christopher J R
collection PubMed
description Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which data accurately describe the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity and nature of the sample collected, and the subsequent steps in viral processing. To investigate this phenomenon, we sequenced replica datasets spanning a range of viruses, and in which the point at which samples were split was different in each case, from a dataset in which independent samples were collected from a single patient to another in which all processing steps up to sequencing were applied to a single sample before splitting the sample and sequencing each replicate. We conclude that neither a high read depth nor a high template number in a sample guarantees the precision of a dataset. Measures of consistency calculated from within a single biological sample may also be insufficient; distortion of the composition of a population by the experimental procedure or genuine within-host diversity between samples may each affect the results. Where it is possible, data from replicate samples should be collected to validate the consistency of short-read sequence data.
format Online
Article
Text
id pubmed-5724399
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-57243992017-12-15 On the effective depth of viral sequence data Illingworth, Christopher J R Roy, Sunando Beale, Mathew A Tutill, Helena Williams, Rachel Breuer, Judith Virus Evol Research Article Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which data accurately describes the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity and nature of the sample collected, and the subsequent steps in viral processing. To investigate this phenomenon, we sequenced replica datasets spanning a range of viruses, and in which the point at which samples were split was different in each case, from a dataset in which independent samples were collected from a single patient to another in which all processing steps up to sequencing were applied to a single sample before splitting the sample and sequencing each replicate. We conclude that neither a high read depth nor a high template number in a sample guarantee the precision of a dataset. Measures of consistency calculated from within a single biological sample may also be insufficient; distortion of the composition of a population by the experimental procedure or genuine within-host diversity between samples may each affect the results. Where it is possible, data from replicate samples should be collected to validate the consistency of short-read sequence data. Oxford University Press 2017-11-14 /pmc/articles/PMC5724399/ /pubmed/29250429 http://dx.doi.org/10.1093/ve/vex030 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Illingworth, Christopher J R
Roy, Sunando
Beale, Mathew A
Tutill, Helena
Williams, Rachel
Breuer, Judith
On the effective depth of viral sequence data
title On the effective depth of viral sequence data
title_full On the effective depth of viral sequence data
title_fullStr On the effective depth of viral sequence data
title_full_unstemmed On the effective depth of viral sequence data
title_short On the effective depth of viral sequence data
title_sort on the effective depth of viral sequence data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5724399/
https://www.ncbi.nlm.nih.gov/pubmed/29250429
http://dx.doi.org/10.1093/ve/vex030
work_keys_str_mv AT illingworthchristopherjr ontheeffectivedepthofviralsequencedata
AT roysunando ontheeffectivedepthofviralsequencedata
AT bealemathewa ontheeffectivedepthofviralsequencedata
AT tutillhelena ontheeffectivedepthofviralsequencedata
AT williamsrachel ontheeffectivedepthofviralsequencedata
AT breuerjudith ontheeffectivedepthofviralsequencedata