On the effective depth of viral sequence data
Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which the data accurately describe the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity...
Main authors: | Illingworth, Christopher J R; Roy, Sunando; Beale, Mathew A; Tutill, Helena; Williams, Rachel; Breuer, Judith |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Oxford University Press
2017
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5724399/ https://www.ncbi.nlm.nih.gov/pubmed/29250429 http://dx.doi.org/10.1093/ve/vex030 |
_version_ | 1783285349566382080 |
---|---|
author | Illingworth, Christopher J R Roy, Sunando Beale, Mathew A Tutill, Helena Williams, Rachel Breuer, Judith |
collection | PubMed |
description | Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which the data accurately describe the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity and nature of the sample collected, and the subsequent steps in viral processing. To investigate this phenomenon, we sequenced replica datasets spanning a range of viruses, in which the point at which samples were split differed in each case: from a dataset in which independent samples were collected from a single patient, to another in which all processing steps up to sequencing were applied to a single sample before the sample was split and each replicate sequenced. We conclude that neither a high read depth nor a high template number in a sample guarantees the precision of a dataset. Measures of consistency calculated from within a single biological sample may also be insufficient; distortion of the composition of a population by the experimental procedure, or genuine within-host diversity between samples, may each affect the results. Where possible, data from replicate samples should be collected to validate the consistency of short-read sequence data. |
format | Online Article Text |
id | pubmed-5724399 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-57243992017-12-15 On the effective depth of viral sequence data Illingworth, Christopher J R Roy, Sunando Beale, Mathew A Tutill, Helena Williams, Rachel Breuer, Judith Virus Evol Research Article Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which data accurately describes the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity and nature of the sample collected, and the subsequent steps in viral processing. To investigate this phenomenon, we sequenced replica datasets spanning a range of viruses, and in which the point at which samples were split was different in each case, from a dataset in which independent samples were collected from a single patient to another in which all processing steps up to sequencing were applied to a single sample before splitting the sample and sequencing each replicate. We conclude that neither a high read depth nor a high template number in a sample guarantee the precision of a dataset. Measures of consistency calculated from within a single biological sample may also be insufficient; distortion of the composition of a population by the experimental procedure or genuine within-host diversity between samples may each affect the results. Where it is possible, data from replicate samples should be collected to validate the consistency of short-read sequence data. Oxford University Press 2017-11-14 /pmc/articles/PMC5724399/ /pubmed/29250429 http://dx.doi.org/10.1093/ve/vex030 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | On the effective depth of viral sequence data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5724399/ https://www.ncbi.nlm.nih.gov/pubmed/29250429 http://dx.doi.org/10.1093/ve/vex030 |