Optimal sequencing strategies for identifying disease-associated singletons
With the increasing focus of genetic association on the identification of trait-associated rare variants through sequencing, it is important to identify the most cost-effective sequencing strategies for these studies. Deep sequencing will accurately detect and genotype the most rare variants per ind...
Main authors: | Rashkin, Sara; Jun, Goo; Chen, Sai; Abecasis, Goncalo R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5501675/ https://www.ncbi.nlm.nih.gov/pubmed/28640830 http://dx.doi.org/10.1371/journal.pgen.1006811 |
_version_ | 1783248834685566976 |
---|---|
author | Rashkin, Sara Jun, Goo Chen, Sai Abecasis, Goncalo R. |
author_facet | Rashkin, Sara Jun, Goo Chen, Sai Abecasis, Goncalo R. |
author_sort | Rashkin, Sara |
collection | PubMed |
description | With the increasing focus of genetic association on the identification of trait-associated rare variants through sequencing, it is important to identify the most cost-effective sequencing strategies for these studies. Deep sequencing will accurately detect and genotype the most rare variants per individual, but may limit sample size. Low pass sequencing will miss some variants in each individual but has been shown to provide a cost-effective alternative for studies of common variants. Here, we investigate the impact of sequencing depth on studies of rare variants, focusing on singletons—the variants that are sampled in a single individual and are hardest to detect at low sequencing depths. We first estimate the sensitivity to detect singleton variants in both simulated data and in down-sampled deep genome and exome sequence data. We then explore the power of association studies comparing burden of singleton variants in cases and controls under a variety of conditions. We show that the power to detect singletons increases with coverage, typically plateauing for coverage > ~25x. Next, we show that, when total sequencing capacity is fixed, the power of association studies focused on singletons is typically maximized for coverage of 15-20x, independent of relative risk, disease prevalence, singleton burden, and case-control ratio. Our results suggest sequencing depth of 15-20x as an appropriate compromise of singleton detection power and sample size for studies of rare variants in complex disease. |
format | Online Article Text |
id | pubmed-5501675 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5501675 2017-07-25 Optimal sequencing strategies for identifying disease-associated singletons Rashkin, Sara Jun, Goo Chen, Sai Abecasis, Goncalo R. PLoS Genet Research Article Public Library of Science 2017-06-22 /pmc/articles/PMC5501675/ /pubmed/28640830 http://dx.doi.org/10.1371/journal.pgen.1006811 Text en © 2017 Rashkin et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Rashkin, Sara Jun, Goo Chen, Sai Abecasis, Goncalo R. Optimal sequencing strategies for identifying disease-associated singletons |
title | Optimal sequencing strategies for identifying disease-associated singletons |
title_full | Optimal sequencing strategies for identifying disease-associated singletons |
title_fullStr | Optimal sequencing strategies for identifying disease-associated singletons |
title_full_unstemmed | Optimal sequencing strategies for identifying disease-associated singletons |
title_short | Optimal sequencing strategies for identifying disease-associated singletons |
title_sort | optimal sequencing strategies for identifying disease-associated singletons |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5501675/ https://www.ncbi.nlm.nih.gov/pubmed/28640830 http://dx.doi.org/10.1371/journal.pgen.1006811 |
work_keys_str_mv | AT rashkinsara optimalsequencingstrategiesforidentifyingdiseaseassociatedsingletons AT jungoo optimalsequencingstrategiesforidentifyingdiseaseassociatedsingletons AT chensai optimalsequencingstrategiesforidentifyingdiseaseassociatedsingletons AT optimalsequencingstrategiesforidentifyingdiseaseassociatedsingletons AT abecasisgoncalor optimalsequencingstrategiesforidentifyingdiseaseassociatedsingletons |
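
The trade-off summarized in the description (deeper coverage detects more singletons per individual, but a fixed sequencing capacity then covers fewer individuals) can be illustrated with a toy calculation. The sketch below is not the authors' method. It assumes, purely for illustration, that the number of reads carrying the variant allele at a heterozygous site is Poisson with mean depth / 2, that a singleton counts as detected when at least `min_alt_reads` such reads are seen, and that capacity is measured in sample-times-depth units. The function names, the threshold of 3 reads, and the capacity value are all hypothetical. Because the model ignores false positives, genotyping error, and the association test itself, its optimum should not be expected to match the 15-20x range reported in the record.

```python
import math


def singleton_sensitivity(depth: float, min_alt_reads: int = 3) -> float:
    """Probability of seeing at least min_alt_reads variant reads at a het site.

    Toy assumption: the variant-allele read count is Poisson(depth / 2).
    """
    lam = depth / 2.0
    # Probability of observing fewer than min_alt_reads variant reads.
    p_below = sum(
        math.exp(-lam) * lam**k / math.factorial(k) for k in range(min_alt_reads)
    )
    return 1.0 - p_below


def relative_yield(
    depth: float, capacity: float = 10_000.0, min_alt_reads: int = 3
) -> float:
    """Detected-singleton yield when total capacity (samples x depth) is fixed.

    Sample size is capacity / depth, so yield = sample size x sensitivity.
    """
    n_samples = capacity / depth
    return n_samples * singleton_sensitivity(depth, min_alt_reads)


if __name__ == "__main__":
    # Print how sensitivity and yield move in opposite directions with depth.
    for depth in (2, 5, 10, 15, 20, 25, 30, 40):
        print(
            f"{depth:>3}x  sensitivity={singleton_sensitivity(depth):.3f}  "
            f"yield={relative_yield(depth):8.1f}"
        )
```

In this toy model sensitivity rises monotonically with depth while sample size falls as 1 / depth, so the yield peaks at an intermediate depth. A realistic power analysis would replace the yield proxy with an explicit case-control burden test and an error model.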