
Assessing the Accuracy and Power of Population Genetic Inference from Low-Pass Next-Generation Sequencing Data

Next-generation sequencing (NGS) technologies have made it possible to address population genetic questions in almost any system, but high error rates associated with such data can introduce significant biases into downstream analyses, necessitating careful experimental design and interpretation in...


Bibliographic Details
Main authors: Crawford, Jacob E., Lazzaro, Brian P.
Format: Online Article Text
Language: English
Published: Frontiers Research Foundation 2012
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3334522/
https://www.ncbi.nlm.nih.gov/pubmed/22536207
http://dx.doi.org/10.3389/fgene.2012.00066
author Crawford, Jacob E.
Lazzaro, Brian P.
collection PubMed
description Next-generation sequencing (NGS) technologies have made it possible to address population genetic questions in almost any system, but high error rates associated with such data can introduce significant biases into downstream analyses, necessitating careful experimental design and interpretation in studies based on short-read sequencing. Exploration of population genetic analyses based on NGS has revealed some of the potential biases, but previous work has emphasized parameters relevant to human population genetics and further examination of parameters relevant to other systems is necessary, including situations where sample sizes are small and genetic variation is high. To assess experimental power to address several principal objectives of population genetic studies under these conditions, we simulated population samples under selective sweep, population growth, and population subdivision models and tested the power to accurately infer population genetic parameters from sequence polymorphism data obtained through simulated 4×, 8×, and 15× read depth sequence data. We found that estimates of population genetic differentiation and population growth parameters were systematically biased when inference was based on 4× sequencing, but biases were markedly reduced at even 8× read depth. We also found that the power to identify footprints of positive selection depends on an interaction between read depth and the strength of selection, with strong selection being recovered consistently at all read depths, but weak selection requiring deeper read depths for reliable detection. Although we have explored only a small subset of the many possible experimental designs and population genetic models, using only one SNP-calling approach, our results reveal some general patterns and provide some assessment of what biases could be expected under similar experimental structures.
format Online
Article
Text
id pubmed-3334522
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Frontiers Research Foundation
record_format MEDLINE/PubMed
spelling pubmed-3334522 2012-04-25 Assessing the Accuracy and Power of Population Genetic Inference from Low-Pass Next-Generation Sequencing Data Crawford, Jacob E. Lazzaro, Brian P. Front Genet Genetics Frontiers Research Foundation 2012-04-24 /pmc/articles/PMC3334522/ /pubmed/22536207 http://dx.doi.org/10.3389/fgene.2012.00066 Text en Copyright © 2012 Crawford and Lazzaro. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
title Assessing the Accuracy and Power of Population Genetic Inference from Low-Pass Next-Generation Sequencing Data
topic Genetics