
XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets


Bibliographic Details
Main authors: Yu, Yao; Hu, Hao; Bohlender, Ryan J; Hu, Fulan; Chen, Jiun-Sheng; Holt, Carson; Fowler, Jerry; Guthery, Stephen L; Scheet, Paul; Hildebrandt, Michelle A T; Yandell, Mark; Huff, Chad D
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5888834/
https://www.ncbi.nlm.nih.gov/pubmed/29294048
http://dx.doi.org/10.1093/nar/gkx1280
_version_ 1783312611738124288
author Yu, Yao
Hu, Hao
Bohlender, Ryan J
Hu, Fulan
Chen, Jiun-Sheng
Holt, Carson
Fowler, Jerry
Guthery, Stephen L
Scheet, Paul
Hildebrandt, Michelle A T
Yandell, Mark
Huff, Chad D
author_sort Yu, Yao
collection PubMed
description High-throughput sequencing data are increasingly being made available to the research community for secondary analyses, providing new opportunities for large-scale association studies. However, heterogeneity in target capture and sequencing technologies often introduces strong technological stratification biases that overwhelm subtle signals of association in studies of complex traits. Here, we introduce the Cross-Platform Association Toolkit, XPAT, which provides a suite of tools designed to support and conduct large-scale association studies with heterogeneous sequencing datasets. XPAT includes tools to support cross-platform-aware variant calling, quality control filtering, gene-based association testing and rare variant effect size estimation. To evaluate the performance of XPAT, we conducted case-control association studies for three diseases, including 783 breast cancer cases, 272 ovarian cancer cases, 205 Crohn disease cases and 3507 shared controls (including 1722 females), using sequencing data from multiple sources. XPAT greatly reduced Type I error inflation in the case-control analyses, while replicating many previously identified disease–gene associations. We also show that association tests conducted with XPAT using cross-platform data have comparable performance to tests using matched-platform data. XPAT enables new association studies that combine existing sequencing datasets to identify genetic loci associated with common diseases and other complex traits.
format Online
Article
Text
id pubmed-5888834
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5888834 2018-04-11 Nucleic Acids Res Methods Online Oxford University Press 2018-04-06 2017-12-23 /pmc/articles/PMC5888834/ /pubmed/29294048 http://dx.doi.org/10.1093/nar/gkx1280 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets
topic Methods Online