Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores
BACKGROUND: Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants (“monogenic”) or by the cumulative effect of numerous common variants (“polygenic”). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of...
Main Authors: | Homburger, Julian R.; Neben, Cynthia L.; Mishne, Gilad; Zhou, Alicia Y.; Kathiresan, Sekar; Khera, Amit V. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6880438/ https://www.ncbi.nlm.nih.gov/pubmed/31771638 http://dx.doi.org/10.1186/s13073-019-0682-2 |
_version_ | 1783473760626540544 |
---|---|
author | Homburger, Julian R. Neben, Cynthia L. Mishne, Gilad Zhou, Alicia Y. Kathiresan, Sekar Khera, Amit V. |
author_facet | Homburger, Julian R. Neben, Cynthia L. Mishne, Gilad Zhou, Alicia Y. Kathiresan, Sekar Khera, Amit V. |
author_sort | Homburger, Julian R. |
collection | PubMed |
description | BACKGROUND: Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants (“monogenic”) or by the cumulative effect of numerous common variants (“polygenic”). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of inherited risk. The traditional approach requires two distinct genetic testing technologies—high coverage sequencing of known genes to detect monogenic variants and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). We assessed the feasibility and accuracy of using low coverage whole genome sequencing (lcWGS) as an alternative to genotyping arrays to calculate GPSs. METHODS: First, we performed downsampling and imputation of WGS data from ten individuals to assess concordance with known genotypes. Second, we assessed the correlation between GPSs for 3 common diseases—coronary artery disease (CAD), breast cancer (BC), and atrial fibrillation (AF)—calculated using lcWGS and genotyping array in 184 samples. Third, we assessed concordance of lcWGS-based genotype calls and GPS calculation in 120 individuals with known genotypes, selected to reflect diverse ancestral backgrounds. Fourth, we assessed the relationship between GPSs calculated using lcWGS and disease phenotypes in a cohort of 11,502 individuals of European ancestry. RESULTS: We found imputation accuracy r(2) values of greater than 0.90 for all ten samples—including those of African and Ashkenazi Jewish ancestry—with lcWGS data at 0.5×. GPSs calculated using lcWGS and genotyping array followed by imputation in 184 individuals were highly correlated for each of the 3 common diseases (r(2) = 0.93–0.97) with similar score distributions. Using lcWGS data from 120 individuals of diverse ancestral backgrounds, we found similar results with respect to imputation accuracy and GPS correlations. 
Finally, we calculated GPSs for CAD, BC, and AF using lcWGS in 11,502 individuals of European ancestry, confirming odds ratios per standard deviation increment ranging from 1.28 to 1.59, consistent with previous studies. CONCLUSIONS: lcWGS is an alternative technology to genotyping arrays for common genetic variant assessment and GPS calculation. lcWGS provides comparable imputation accuracy while also overcoming the ascertainment bias inherent to variant selection in genotyping array design. |
format | Online Article Text |
id | pubmed-6880438 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-68804382019-11-29 Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores Homburger, Julian R. Neben, Cynthia L. Mishne, Gilad Zhou, Alicia Y. Kathiresan, Sekar Khera, Amit V. Genome Med Research BACKGROUND: Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants (“monogenic”) or by the cumulative effect of numerous common variants (“polygenic”). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of inherited risk. The traditional approach requires two distinct genetic testing technologies—high coverage sequencing of known genes to detect monogenic variants and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). We assessed the feasibility and accuracy of using low coverage whole genome sequencing (lcWGS) as an alternative to genotyping arrays to calculate GPSs. METHODS: First, we performed downsampling and imputation of WGS data from ten individuals to assess concordance with known genotypes. Second, we assessed the correlation between GPSs for 3 common diseases—coronary artery disease (CAD), breast cancer (BC), and atrial fibrillation (AF)—calculated using lcWGS and genotyping array in 184 samples. Third, we assessed concordance of lcWGS-based genotype calls and GPS calculation in 120 individuals with known genotypes, selected to reflect diverse ancestral backgrounds. Fourth, we assessed the relationship between GPSs calculated using lcWGS and disease phenotypes in a cohort of 11,502 individuals of European ancestry. RESULTS: We found imputation accuracy r(2) values of greater than 0.90 for all ten samples—including those of African and Ashkenazi Jewish ancestry—with lcWGS data at 0.5×. 
GPSs calculated using lcWGS and genotyping array followed by imputation in 184 individuals were highly correlated for each of the 3 common diseases (r(2) = 0.93–0.97) with similar score distributions. Using lcWGS data from 120 individuals of diverse ancestral backgrounds, we found similar results with respect to imputation accuracy and GPS correlations. Finally, we calculated GPSs for CAD, BC, and AF using lcWGS in 11,502 individuals of European ancestry, confirming odds ratios per standard deviation increment ranging from 1.28 to 1.59, consistent with previous studies. CONCLUSIONS: lcWGS is an alternative technology to genotyping arrays for common genetic variant assessment and GPS calculation. lcWGS provides comparable imputation accuracy while also overcoming the ascertainment bias inherent to variant selection in genotyping array design. BioMed Central 2019-11-26 /pmc/articles/PMC6880438/ /pubmed/31771638 http://dx.doi.org/10.1186/s13073-019-0682-2 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Homburger, Julian R. Neben, Cynthia L. Mishne, Gilad Zhou, Alicia Y. Kathiresan, Sekar Khera, Amit V. Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
title | Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
title_full | Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
title_fullStr | Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
title_full_unstemmed | Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
title_short | Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
title_sort | low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6880438/ https://www.ncbi.nlm.nih.gov/pubmed/31771638 http://dx.doi.org/10.1186/s13073-019-0682-2 |
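
The description field in this record reports r(2) values between GPSs computed from lcWGS and from genotyping arrays, but the record itself gives no formula. As a rough illustration only, the sketch below assumes the common definition of a polygenic score as a weighted sum of allele dosages and computes the squared Pearson correlation between two sets of scores. The function names, weights, and dosage values are hypothetical and are not taken from the article; this is not the authors' pipeline.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (each 0..2) with per-variant weights.

    Assumption: this is the generic textbook form of a polygenic score,
    not the specific scoring procedure used in the catalogued article.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical effect weights for three variants (illustrative values only).
    weights = [0.12, -0.05, 0.30]
    # Hypothetical dosages for four samples from two genotyping methods.
    lcwgs_dosages = [[0.1, 1.9, 1.0], [1.0, 1.0, 0.2], [2.0, 0.0, 1.8], [0.0, 2.0, 0.0]]
    array_dosages = [[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [2.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
    lcwgs_scores = [polygenic_score(d, weights) for d in lcwgs_dosages]
    array_scores = [polygenic_score(d, weights) for d in array_dosages]
    print("r^2 between methods:", r_squared(lcwgs_scores, array_scores))
```

Fractional dosages are used for the lcWGS samples because imputed genotypes are typically expressed as expected allele counts rather than hard calls; that choice is also an assumption of this sketch.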