
Measurements of intrahost viral diversity require an unbiased diversity metric

Viruses exist within hosts at large population sizes and are subject to high rates of mutation. As such, viral populations exhibit considerable sequence diversity. A variety of summary statistics have been developed which describe, in a single number, the extent of diversity in a viral population; s...


Bibliographic Details
Main authors: Zhao, Lei; Illingworth, Christopher J R
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6354029/
https://www.ncbi.nlm.nih.gov/pubmed/30723551
http://dx.doi.org/10.1093/ve/vey041
collection PubMed
description Viruses exist within hosts at large population sizes and are subject to high rates of mutation. As such, viral populations exhibit considerable sequence diversity. A variety of summary statistics have been developed which describe, in a single number, the extent of diversity in a viral population; such measurements allow the diversities of different populations to be compared, and the effect of evolutionary forces on a population to be assessed. Here we highlight statistical artefacts underlying some common measures of sequence diversity, whereby variation in the depth of genome sequencing may substantially affect the extent of diversity measured in a viral population, making comparisons of population diversity invalid. Specifically, naive estimation of sequence entropy provides a systematically biased metric, a lower read depth being expected to produce a lower estimate of diversity. The number of polymorphic loci per kilobase of genome is more unpredictably affected by read depth, giving potentially flawed results at lower sequencing depths. We show that the nucleotide diversity statistic π provides an unbiased estimate of diversity in the sense that the expected value of the statistic is equal to the correct value of the property being measured. Our results are of importance for studies interpreting genome sequence data; we describe how diversity may be assessed in viral populations in a fair and unbiased manner.
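The read-depth effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function names, the two-allele site with frequencies 0.9 and 0.1, and the replicate count are all illustrative assumptions, and the data are simulated. It also assumes that π at a single site is the standard pairwise-difference estimator, 1 - Σ n_i(n_i - 1) / (N(N - 1)), which may differ in detail from the definition used in the article. Under multinomial sampling of reads, the mean of this estimator matches the true value at any depth, whereas the mean of the naive (plug-in) entropy falls below the true entropy as depth decreases.

```python
import math
import random
from collections import Counter


def naive_entropy(counts):
    """Plug-in Shannon entropy (in nats) from allele counts at one site."""
    depth = sum(counts)
    return -sum((n / depth) * math.log(n / depth) for n in counts if n > 0)


def pi_unbiased(counts):
    """Fraction of distinct read pairs that differ at the site.

    Assumed form: 1 - sum n_i (n_i - 1) / (N (N - 1)).
    """
    depth = sum(counts)
    if depth < 2:
        raise ValueError("need at least two reads")
    same = sum(n * (n - 1) for n in counts)
    return 1.0 - same / (depth * (depth - 1))


def sample_counts(freqs, depth, rng):
    """Simulate sequencing: draw `depth` reads from the true frequencies."""
    draws = rng.choices(range(len(freqs)), weights=freqs, k=depth)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in range(len(freqs))]


def mean_estimates(freqs, depth, replicates=20000, seed=1):
    """Average both statistics over many simulated samples at one depth."""
    rng = random.Random(seed)
    h_sum = 0.0
    pi_sum = 0.0
    for _ in range(replicates):
        counts = sample_counts(freqs, depth, rng)
        h_sum += naive_entropy(counts)
        pi_sum += pi_unbiased(counts)
    return h_sum / replicates, pi_sum / replicates


# Hypothetical single site with a 10% minority variant.
TRUE_FREQS = [0.9, 0.1]
TRUE_ENTROPY = -sum(p * math.log(p) for p in TRUE_FREQS)
TRUE_PI = 1.0 - sum(p * p for p in TRUE_FREQS)

if __name__ == "__main__":
    print(f"true entropy = {TRUE_ENTROPY:.4f}, true pi = {TRUE_PI:.4f}")
    for depth in (10, 100, 1000):
        h_mean, pi_mean = mean_estimates(TRUE_FREQS, depth)
        print(f"depth {depth:5d}: mean entropy = {h_mean:.4f}, mean pi = {pi_mean:.4f}")
```

Running the script prints the mean of each statistic at three read depths alongside the true values, so the shortfall in entropy at low depth can be compared directly with the stability of π.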
id pubmed-6354029
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6354029 2019-02-05 Virus Evol Research Article Oxford University Press 2019-01-30 /pmc/articles/PMC6354029/ /pubmed/30723551 http://dx.doi.org/10.1093/ve/vey041 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article