FunSPU: A versatile and adaptive multiple functional annotation-based association test of whole-genome sequencing data

Despite ongoing large-scale population-based whole-genome sequencing (WGS) projects such as the NIH NHLBI TOPMed program and the NHGRI Genome Sequencing Program, WGS-based association analysis of complex traits remains a tremendous challenge due to the large number of rare variants, many of which ar...

Bibliographic Details
Main authors: Ma, Yiding; Wei, Peng
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6508749/
https://www.ncbi.nlm.nih.gov/pubmed/31034468
http://dx.doi.org/10.1371/journal.pgen.1008081
author Ma, Yiding
Wei, Peng
collection PubMed
description Despite ongoing large-scale population-based whole-genome sequencing (WGS) projects such as the NIH NHLBI TOPMed program and the NHGRI Genome Sequencing Program, WGS-based association analysis of complex traits remains a tremendous challenge due to the large number of rare variants, many of which are non-trait-associated neutral variants. External biological knowledge, such as functional annotations based on the ENCODE, Epigenomics Roadmap and GTEx projects, may be helpful in distinguishing causal rare variants from neutral ones; however, each functional annotation captures only certain aspects of biological function. Our knowledge for selecting informative annotations a priori is limited, and incorporating non-informative annotations introduces noise and reduces power. We propose FunSPU, a versatile and adaptive test that incorporates multiple biological annotations and is adaptive at both the annotation and variant levels, and thus maintains high power even in the presence of non-informative annotations. In addition to extensive simulations, we illustrate our proposed test using the TWINSUK cohort (n = 1,752) of UK10K WGS data based on six functional annotations: CADD, RegulomeDB, FunSeq, FunSeq2, GERP++, and GenoSkyline. We identified genome-wide significant genetic loci on chromosome 19 near the genes TOMM40 and APOC4-APOC2 associated with low-density lipoprotein (LDL), which were replicated in the UK10K ALSPAC cohort (n = 1,497). These replicated LDL-associated loci were missed by existing rare variant association tests that either ignore external biological information or rely on a single source of biological knowledge. We have implemented the proposed test in an R package “FunSPU”.
format Online
Article
Text
id pubmed-6508749
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6508749 2019-05-23 FunSPU: A versatile and adaptive multiple functional annotation-based association test of whole-genome sequencing data Ma, Yiding Wei, Peng PLoS Genet Research Article
Public Library of Science 2019-04-29 /pmc/articles/PMC6508749/ /pubmed/31034468 http://dx.doi.org/10.1371/journal.pgen.1008081 Text en © 2019 Ma, Wei http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title FunSPU: A versatile and adaptive multiple functional annotation-based association test of whole-genome sequencing data
topic Research Article