Using Incomplete Trios to Boost Confidence in Family Based Association Studies
Most currently available family based association tests are designed to account only for nuclear families with complete genotypes for parents as well as offspring. Due to the availability of increasingly less expensive generation of whole genome sequencing information, genetic studies are able to co...
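Main authors: | Dhankani, Varsha; Gibbs, David L.; Knijnenburg, Theo; Kramer, Roger; Vockley, Joseph; Niederhuber, John; Shmulevich, Ilya; Bernard, Brady |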
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2016 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4796035/ https://www.ncbi.nlm.nih.gov/pubmed/27047537 http://dx.doi.org/10.3389/fgene.2016.00034 |
_version_ | 1782421699246948352 |
---|---|
author | Dhankani, Varsha Gibbs, David L. Knijnenburg, Theo Kramer, Roger Vockley, Joseph Niederhuber, John Shmulevich, Ilya Bernard, Brady |
author_sort | Dhankani, Varsha |
collection | PubMed |
description | Most currently available family based association tests are designed to account only for nuclear families with complete genotypes for parents as well as offspring. Due to the availability of increasingly less expensive generation of whole genome sequencing information, genetic studies are able to collect data for more families and from large family cohorts with the goal of improving statistical power. However, due to missing genotypes, many families are not included in the family based association tests, negating the benefits of large scale sequencing data. Here, we present the CIFBAT method to use incomplete families in Family Based Association Test (FBAT) to evaluate robustness against missing data. CIFBAT uses quantile intervals of the FBAT statistic by randomly choosing valid completions of incomplete family genotypes based on Mendelian inheritance rules. By considering all valid completions equally likely and computing quantile intervals over many randomized iterations, CIFBAT avoids assumption of a homogeneous population structure or any particular missingness pattern in the data. Using simulated data, we show that the quantile intervals computed by CIFBAT are useful in validating robustness of the FBAT statistic against missing data and in identifying genomic markers with higher precision. We also propose a novel set of candidate genomic markers for uterine related abnormalities from analysis of familial whole genome sequences, and provide validation for a previously established set of candidate markers for Type 1 diabetes. We have provided a software package that incorporates TDT, robustTDT, FBAT, and CIFBAT. The data format proposed for the software uses half the memory space that the standard FBAT format (PED) files use, making it efficient for large scale genome wide association studies. |
format | Online Article Text |
id | pubmed-4796035 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4796035 2016-04-04 Using Incomplete Trios to Boost Confidence in Family Based Association Studies Dhankani, Varsha Gibbs, David L. Knijnenburg, Theo Kramer, Roger Vockley, Joseph Niederhuber, John Shmulevich, Ilya Bernard, Brady Front Genet Genetics Frontiers Media S.A. 2016-03-18 /pmc/articles/PMC4796035/ /pubmed/27047537 http://dx.doi.org/10.3389/fgene.2016.00034 Text en Copyright © 2016 Dhankani, Gibbs, Knijnenburg, Kramer, Vockley, Niederhuber, Shmulevich and Bernard. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Using Incomplete Trios to Boost Confidence in Family Based Association Studies |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4796035/ https://www.ncbi.nlm.nih.gov/pubmed/27047537 http://dx.doi.org/10.3389/fgene.2016.00034 |
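
The abstract in this record describes CIFBAT as drawing random, equally likely, Mendelian-consistent completions of incomplete family genotypes and reporting quantile intervals of the test statistic over many iterations. The Python sketch below illustrates that idea under loud assumptions and is not the authors' software: genotypes are coded as 0, 1, or 2 copies of a tested allele at one biallelic marker, every trio has an affected child, and the classic TDT chi-square stands in for the FBAT statistic. All function names (`valid_completions`, `tdt_statistic`, `quantile_interval`) are hypothetical.

```python
import random

GENOTYPES = (0, 1, 2)  # copies of the tested allele carried by one person


def can_pass(genotype, allele):
    """True if a parent with this genotype can transmit `allele` (1 = tested allele)."""
    return genotype >= 1 if allele == 1 else genotype <= 1


def is_mendelian(father, mother, child):
    """True if the child genotype can arise from one allele of each parent."""
    return any(
        can_pass(father, a) and can_pass(mother, child - a)
        for a in (0, 1)
        if child - a in (0, 1)
    )


def valid_completions(father, mother, child):
    """List every (father, mother) pair consistent with the child; None = missing."""
    fathers = GENOTYPES if father is None else (father,)
    mothers = GENOTYPES if mother is None else (mother,)
    return [(f, m) for f in fathers for m in mothers if is_mendelian(f, m, child)]


def tdt_statistic(trios):
    """TDT chi-square, (b - c)^2 / (b + c), over complete, Mendelian-consistent trios."""
    b = c = 0  # b: tested allele transmitted by a heterozygous parent, c: not transmitted
    for father, mother, child in trios:
        het = (father == 1) + (mother == 1)
        # Homozygous-2 parents always transmit the tested allele, so the child's
        # remaining copies must have come from heterozygous parents.
        sent = child - (father == 2) - (mother == 2)
        b += sent
        c += het - sent
    return (b - c) ** 2 / (b + c) if b + c else 0.0


def quantile_interval(trios, iterations=1000, lower=0.025, upper=0.975, seed=0):
    """Sample completions uniformly and return (lower, upper) quantiles of the statistic."""
    rng = random.Random(seed)
    options = [valid_completions(*trio) for trio in trios]
    if not all(options):
        raise ValueError("at least one trio has no Mendelian-consistent completion")
    stats = sorted(
        tdt_statistic([(*rng.choice(opts), trio[2]) for opts, trio in zip(options, trios)])
        for _ in range(iterations)
    )

    def pick(q):
        # Nearest-rank quantile on the sorted statistics.
        return stats[round(q * (iterations - 1))]

    return pick(lower), pick(upper)


if __name__ == "__main__":
    # Trios are (father, mother, child); None marks a missing parental genotype.
    family_data = [(None, 0, 1), (1, None, 2), (1, 1, 0), (1, 0, 1)]
    print(quantile_interval(family_data, iterations=500))
```

When every trio is complete, each iteration yields the same statistic and the interval collapses to a single value. A wide interval indicates that the result depends heavily on how the missing genotypes are filled in, which is the kind of robustness check the abstract attributes to CIFBAT.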