A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required
The advent of next generation sequencing (NGS) technologies enabled the investigation of the rare variant-common disease hypothesis in unrelated individuals, even on the genome-wide level. Analysis of this hypothesis requires tailored statistical methods as single marker tests fail on rare variants....
Main authors: | Dering, Carmen; König, Inke R.; Ramsey, Laura B.; Relling, Mary V.; Yang, Wenjian; Ziegler, Andreas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2014 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4164031/ https://www.ncbi.nlm.nih.gov/pubmed/25309579 http://dx.doi.org/10.3389/fgene.2014.00323 |
_version_ | 1782334903215456256 |
---|---|
author | Dering, Carmen; König, Inke R.; Ramsey, Laura B.; Relling, Mary V.; Yang, Wenjian; Ziegler, Andreas |
author_sort | Dering, Carmen |
collection | PubMed |
description | The advent of next generation sequencing (NGS) technologies enabled the investigation of the rare variant-common disease hypothesis in unrelated individuals, even on the genome-wide level. Analysis of this hypothesis requires tailored statistical methods as single marker tests fail on rare variants. An entire class of statistical methods collapses rare variants from a genomic region of interest (ROI), thereby aggregating rare variants. In an extensive simulation study using data from the Genetic Analysis Workshop 17 we compared the performance of 15 collapsing methods by means of a variety of pre-defined ROIs regarding minor allele frequency thresholds and functionality. Findings of the simulation study were additionally confirmed by a real data set investigating the association between methotrexate clearance and the SLCO1B1 gene in patients with acute lymphoblastic leukemia. Our analyses showed substantially inflated type I error levels for many of the proposed collapsing methods. Only four approaches yielded valid type I errors in all considered scenarios. None of the statistical tests was able to detect true associations over a substantial proportion of replicates in the simulated data. Detailed annotation of functionality of variants is crucial to detect true associations. These findings were confirmed in the analysis of the real data. Recent theoretical work showed that large power is achieved in gene-based analyses only if large sample sizes are available and a substantial proportion of causing rare variants is present in the gene-based analysis. Many of the investigated statistical approaches use permutation requiring high computational cost. There is a clear need for valid, powerful and fast to calculate test statistics for studies investigating rare variants. |
format | Online Article Text |
id | pubmed-4164031 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4164031 2014-10-10 A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required Dering, Carmen König, Inke R. Ramsey, Laura B. Relling, Mary V. Yang, Wenjian Ziegler, Andreas Front Genet Genetics Frontiers Media S.A. 2014-09-15 /pmc/articles/PMC4164031/ /pubmed/25309579 http://dx.doi.org/10.3389/fgene.2014.00323 Text en Copyright © 2014 Dering, König, Ramsey, Relling, Yang and Ziegler. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4164031/ https://www.ncbi.nlm.nih.gov/pubmed/25309579 http://dx.doi.org/10.3389/fgene.2014.00323 |
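
The abstract describes collapsing methods as aggregating the rare variants within a region of interest (ROI) selected by a minor allele frequency threshold, and notes that many of the evaluated tests rely on costly permutation. The following is a minimal sketch of that general idea only, not one of the 15 methods compared in the article. The function names, the case-control coding (1 = case, 0 = control), the mean-difference statistic, and the 0.01 threshold are all assumptions chosen for illustration.

```python
import random


def collapse_burden(genotypes, mafs, maf_threshold=0.01):
    """Collapse rare variants in an ROI into one burden score per individual.

    genotypes: one row per individual, one minor-allele count (0, 1, 2) per variant.
    mafs: minor allele frequency of each variant (same order as the columns).
    maf_threshold: hypothetical cutoff; variants below it count as rare.
    """
    rare = [j for j, freq in enumerate(mafs) if freq < maf_threshold]
    return [sum(row[j] for j in rare) for row in genotypes]


def mean_difference(scores, labels):
    """Difference in mean burden score between cases (1) and controls (0)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def permutation_p_value(scores, labels, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the mean-difference statistic.

    Labels are shuffled n_perm times; the cost grows linearly with n_perm,
    which is the computational burden the abstract refers to.
    """
    rng = random.Random(seed)
    observed = abs(mean_difference(scores, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(mean_difference(scores, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: 6 individuals, 3 variants; the second variant is common.
    genotypes = [[1, 0, 2], [1, 1, 1], [0, 2, 2], [0, 1, 0], [0, 0, 0], [0, 2, 0]]
    mafs = [0.005, 0.20, 0.001]
    labels = [1, 1, 1, 0, 0, 0]
    scores = collapse_burden(genotypes, mafs)
    print(scores, permutation_p_value(scores, labels))
```

Restricting the ROI further, for example to variants annotated as functional, would amount to filtering the columns before collapsing; the abstract reports that such annotation is crucial for detecting true associations.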