Variant calling in low-coverage whole genome sequencing of a Native American population sample

Bibliographic Details
Main authors: Bizon, Chris; Spiegel, Michael; Chasse, Scott A; Gizer, Ian R; Li, Yun; Malc, Ewa P; Mieczkowski, Piotr A; Sailsbery, Josh K; Wang, Xiaoshu; Ehlers, Cindy L; Wilhelmsen, Kirk C
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914019/
https://www.ncbi.nlm.nih.gov/pubmed/24479562
http://dx.doi.org/10.1186/1471-2164-15-85
author Bizon, Chris
Spiegel, Michael
Chasse, Scott A
Gizer, Ian R
Li, Yun
Malc, Ewa P
Mieczkowski, Piotr A
Sailsbery, Josh K
Wang, Xiaoshu
Ehlers, Cindy L
Wilhelmsen, Kirk C
author_sort Bizon, Chris
collection PubMed
description BACKGROUND: The falling cost of sequencing a human genome has led to the use of genotype sampling strategies to impute and infer the presence of sequence variants, which can then be tested for associations with traits of interest. Low-coverage whole genome sequencing (WGS) is a sampling strategy that overcomes some of the deficiencies seen in fixed-content SNP array studies. Linkage-disequilibrium (LD) aware variant callers, such as the program Thunder, may provide a calling rate and accuracy that make a low-coverage sequencing strategy viable. RESULTS: We examined the performance of an LD-aware variant calling strategy in a population of 708 low-coverage whole genome sequences from a community sample of Native Americans. We assessed variant calling by comparing the sequencing results to genotypes measured in 641 of the same subjects using a fixed-content, first-generation exome array. The comparison was made for two variant callers: the GATK Unified Genotyper and the LD-aware variant caller Thunder. Thunder was found to improve concordance in a coverage-dependent fashion, while correctly calling nearly all of the common variants as well as a high percentage of the rare variants present in the sample. CONCLUSIONS: Low-coverage WGS is a strategy that appears to collect genetic information intermediate in scope between fixed-content genotyping arrays and deep-coverage WGS. Our data suggest that, for large-sample association analyses, low-coverage WGS is a viable strategy with a greater chance of discovering novel variants and associations than fixed-content arrays.
format Online
Article
Text
id pubmed-3914019
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3914019 2014-02-14 Variant calling in low-coverage whole genome sequencing of a Native American population sample Bizon, Chris; Spiegel, Michael; Chasse, Scott A; Gizer, Ian R; Li, Yun; Malc, Ewa P; Mieczkowski, Piotr A; Sailsbery, Josh K; Wang, Xiaoshu; Ehlers, Cindy L; Wilhelmsen, Kirk C BMC Genomics Research Article
BioMed Central 2014-01-30 /pmc/articles/PMC3914019/ /pubmed/24479562 http://dx.doi.org/10.1186/1471-2164-15-85 Text en Copyright © 2014 Bizon et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title Variant calling in low-coverage whole genome sequencing of a Native American population sample
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914019/
https://www.ncbi.nlm.nih.gov/pubmed/24479562
http://dx.doi.org/10.1186/1471-2164-15-85