
Adaptive Combination of P-Values for Family-Based Association Testing with Sequence Data

Family-based study designs will play a key role in identifying rare causal variants, because rare causal variants can be enriched in families with multiple affected subjects. Furthermore, unlike population-based studies, family studies are robust to bias induced by population substructure. It...


Bibliographic Details
Main author: Lin, Wan-Yu
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4277421/
https://www.ncbi.nlm.nih.gov/pubmed/25541952
http://dx.doi.org/10.1371/journal.pone.0115971
author Lin, Wan-Yu
collection PubMed
description Family-based study designs will play a key role in identifying rare causal variants, because rare causal variants can be enriched in families with multiple affected subjects. Furthermore, unlike population-based studies, family studies are robust to bias induced by population substructure. It is well known that rare causal variants are difficult to detect with single-locus tests. Therefore, burden tests and non-burden tests have been developed that combine the signals of multiple variants in a chromosomal region or a functional unit. This inevitably incorporates some neutral variants into the test statistics, which can dilute the power of the tests. To guard against the noise caused by neutral variants, we here propose an ‘adaptive combination of P-values method’ (abbreviated as ‘ADA’). This method combines the per-site P-values of variants that are more likely to be causal. Variants with large P-values (which are more likely to be neutral variants) are discarded from the combined statistic. In addition to performing extensive simulation studies, we applied these tests to the Genetic Analysis Workshop 17 data sets, where real sequence data were generated according to the 1000 Genomes Project. Compared with some existing methods, ADA is more robust to the inclusion of neutral variants. This is an advantage especially when dichotomous traits are analyzed. However, ADA has some limitations. First, it is more computationally intensive. Second, pedigree structures and founders' sequence data are required for the permutation procedure. Third, unrelated controls cannot be included. We here show that, for family-based studies, the application of ADA is limited to dichotomous trait analyses with full pedigree information.
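The description above says only that ADA combines per-site P-values and discards large ones, so the following is a minimal sketch of that general idea, not the author's implementation. It assumes a Fisher-type sum of -log(p) over sites below a truncation threshold, a small set of candidate thresholds, and a min-P step calibrated against permutations. The function names, the default thresholds, and the input layout are all hypothetical. The permuted per-site P-values are taken as given, because the record does not describe how the family-based permutation (which needs pedigree structures and founders' sequence data) is carried out.

```python
import numpy as np


def truncated_stat(pvals, threshold):
    """Sum of -log(p) over sites whose P-value is below `threshold`.

    Sites at or above the threshold (more likely neutral) contribute nothing.
    """
    pvals = np.clip(np.asarray(pvals, dtype=float), 1e-300, 1.0)  # avoid log(0)
    keep = pvals < threshold
    return float(np.sum(-np.log(pvals[keep])))


def ada_pvalue(obs_pvals, perm_pvals, thresholds=(0.10, 0.15, 0.20)):
    """Region-level P-value from an adaptive choice of truncation threshold.

    obs_pvals  : shape (n_sites,), per-site P-values from the observed data.
    perm_pvals : shape (n_perm, n_sites), per-site P-values from each permutation
                 (assumed to be supplied by the caller).
    thresholds : candidate truncation thresholds (illustrative values only).
    """
    all_p = np.vstack([np.asarray(obs_pvals, dtype=float),
                       np.asarray(perm_pvals, dtype=float)])
    # Row 0 holds the observed data, rows 1..B the permutations.
    stats = np.array([[truncated_stat(row, t) for t in thresholds]
                      for row in all_p])
    n_rows = stats.shape[0]
    # For every row and threshold: share of rows whose statistic is at least
    # as large (each row counts itself, so no value is ever zero).
    ge = (stats[None, :, :] >= stats[:, None, :]).sum(axis=1)
    p_per_threshold = ge / n_rows
    # Adaptive step: keep the best threshold for each row.
    min_p = p_per_threshold.min(axis=1)
    # Calibrate the observed minimum against the same quantity under permutation.
    return float(np.mean(min_p <= min_p[0]))
```

Because the minimum over thresholds is recomputed for every permutation, the choice of threshold is accounted for in the final P-value rather than treated as fixed in advance. The pairwise comparison stores on the order of (n_perm + 1)^2 values per threshold, which is fine for a few thousand permutations but would need restructuring beyond that.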
format Online
Article
Text
id pubmed-4277421
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4277421 2014-12-31 Adaptive Combination of P-Values for Family-Based Association Testing with Sequence Data Lin, Wan-Yu PLoS One Research Article
Public Library of Science 2014-12-26 /pmc/articles/PMC4277421/ /pubmed/25541952 http://dx.doi.org/10.1371/journal.pone.0115971 Text en © 2014 Wan-Yu Lin http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Adaptive Combination of P-Values for Family-Based Association Testing with Sequence Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4277421/
https://www.ncbi.nlm.nih.gov/pubmed/25541952
http://dx.doi.org/10.1371/journal.pone.0115971