Using Incomplete Trios to Boost Confidence in Family Based Association Studies

Most currently available family based association tests are designed to account only for nuclear families with complete genotypes for parents as well as offspring. Due to the availability of increasingly less expensive generation of whole genome sequencing information, genetic studies are able to co...

Bibliographic Details
Main Authors: Dhankani, Varsha, Gibbs, David L., Knijnenburg, Theo, Kramer, Roger, Vockley, Joseph, Niederhuber, John, Shmulevich, Ilya, Bernard, Brady
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4796035/
https://www.ncbi.nlm.nih.gov/pubmed/27047537
http://dx.doi.org/10.3389/fgene.2016.00034
author Dhankani, Varsha
Gibbs, David L.
Knijnenburg, Theo
Kramer, Roger
Vockley, Joseph
Niederhuber, John
Shmulevich, Ilya
Bernard, Brady
collection PubMed
description Most currently available family based association tests are designed to account only for nuclear families with complete genotypes for parents as well as offspring. Due to the availability of increasingly less expensive generation of whole genome sequencing information, genetic studies are able to collect data for more families and from large family cohorts with the goal of improving statistical power. However, due to missing genotypes, many families are not included in the family based association tests, negating the benefits of large scale sequencing data. Here, we present the CIFBAT method to use incomplete families in Family Based Association Test (FBAT) to evaluate robustness against missing data. CIFBAT uses quantile intervals of the FBAT statistic by randomly choosing valid completions of incomplete family genotypes based on Mendelian inheritance rules. By considering all valid completions equally likely and computing quantile intervals over many randomized iterations, CIFBAT avoids assumption of a homogeneous population structure or any particular missingness pattern in the data. Using simulated data, we show that the quantile intervals computed by CIFBAT are useful in validating robustness of the FBAT statistic against missing data and in identifying genomic markers with higher precision. We also propose a novel set of candidate genomic markers for uterine related abnormalities from analysis of familial whole genome sequences, and provide validation for a previously established set of candidate markers for Type 1 diabetes. We have provided a software package that incorporates TDT, robustTDT, FBAT, and CIFBAT. The data format proposed for the software uses half the memory space that the standard FBAT format (PED) files use, making it efficient for large scale genome wide association studies.
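The completion-and-quantile procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' software: it assumes a single biallelic marker, trios coded as (father, mother, affected child) with genotypes given as minor-allele counts 0/1/2 and None for a missing genotype, and it substitutes a plain TDT-style chi-square for the FBAT statistic. All function names are hypothetical.

```python
import itertools
import random

# Minor-allele counts a parent with genotype g (0, 1 or 2) can transmit.
TRANSMITTABLE = {0: (0,), 1: (0, 1), 2: (1,)}


def is_mendelian(father, mother, child):
    """True if the child genotype can arise from the two parental genotypes."""
    return any(a + b == child
               for a in TRANSMITTABLE[father]
               for b in TRANSMITTABLE[mother])


def valid_completions(trio):
    """All fully genotyped, Mendelian-consistent trios agreeing with the observed values."""
    options = [(0, 1, 2) if g is None else (g,) for g in trio]
    return [t for t in itertools.product(*options) if is_mendelian(*t)]


def tdt_statistic(trios):
    """TDT-style chi-square, used here as a stand-in for the FBAT statistic (assumption)."""
    b = c = 0  # minor alleles transmitted / not transmitted by heterozygous parents
    for father, mother, child in trios:
        het = (father == 1) + (mother == 1)
        hom_minor = (father == 2) + (mother == 2)
        transmitted = child - hom_minor  # each homozygous-minor parent supplies one copy
        b += transmitted
        c += het - transmitted
    return (b - c) ** 2 / (b + c) if b + c else 0.0


def quantile_interval(trios, iterations=1000, lower=0.025, upper=0.975, seed=0):
    """Quantile interval of the statistic over random, equally likely valid completions."""
    rng = random.Random(seed)
    choices = [valid_completions(t) for t in trios]
    if any(not c for c in choices):
        raise ValueError("a trio has no Mendelian-consistent completion")
    stats = sorted(
        tdt_statistic([rng.choice(c) for c in choices])
        for _ in range(iterations)
    )
    return (stats[int(lower * (iterations - 1))],
            stats[int(upper * (iterations - 1))])
```

With fully genotyped trios the interval collapses to a single value, the statistic itself. A wide interval in the presence of missing genotypes suggests that the result for that marker depends heavily on the unobserved data.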
format Online
Article
Text
id pubmed-4796035
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4796035 2016-04-04 Front Genet Genetics Frontiers Media S.A. 2016-03-18 /pmc/articles/PMC4796035/ /pubmed/27047537 http://dx.doi.org/10.3389/fgene.2016.00034 Text en Copyright © 2016 Dhankani, Gibbs, Knijnenburg, Kramer, Vockley, Niederhuber, Shmulevich and Bernard. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Using Incomplete Trios to Boost Confidence in Family Based Association Studies
topic Genetics