
Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions

When testing association between rare variants and diseases, an efficient analytical approach involves considering a set of variants in a genomic region as the unit of analysis. One factor complicating this approach is that the vast majority of rare variants in practical applications are believed to...


Bibliographic Details
Main authors: Zhang, Lei; Pei, Yu-Fang; Li, Jian; Papasian, Christopher J.; Deng, Hong-Wen
Format: Text
Language: English
Published: Public Library of Science 2010
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3000820/
https://www.ncbi.nlm.nih.gov/pubmed/21170328
http://dx.doi.org/10.1371/journal.pone.0014288
author Zhang, Lei
Pei, Yu-Fang
Li, Jian
Papasian, Christopher J.
Deng, Hong-Wen
author_sort Zhang, Lei
collection PubMed
description When testing association between rare variants and diseases, an efficient analytical approach involves considering a set of variants in a genomic region as the unit of analysis. One factor complicating this approach is that the vast majority of rare variants in practical applications are believed to represent background neutral variation. As a result, analyzing a single set containing all variants may not be a powerful approach. Here, we propose two alternative strategies. In the first, we analyze the subsets of rare variants exhaustively. In the second, we categorize variants selectively into two subsets: one in which variants are overrepresented in cases, and the other in which variants are overrepresented in controls. We show by simulation that, when the proportion of neutral variants is moderate to large, both proposed strategies improve statistical power over methods that analyze a single set containing all variants. When applied to a real sequencing association study, the proposed methods consistently produce smaller p-values than their competitors. When applied to another real sequencing dataset to study differences in rare allele distributions between ethnic populations, the proposed methods detect overrepresentation of variants between the CHB (Chinese Han in Beijing) and YRI (Yoruba people of Ibadan) populations with small p-values. Additional analyses suggest that there is no difference between the CHB and CHD (Chinese Han in Denver) datasets, as expected. Finally, when applied to the CHB and JPT (Japanese people in Tokyo) populations, existing methods fail to detect any difference, whereas the proposed methods detect differences in several regions.
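The second strategy in the description above (splitting variants into a case-enriched subset and a control-enriched subset) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names `split_statistic` and `permutation_p_value`, the 0/1 carrier encoding, the test statistic (sum of squared carrier-frequency differences over the two subsets), and the permutation scheme are all assumptions chosen for illustration. The sketch also assumes both groups are non-empty.

```python
import random


def split_statistic(genotypes, labels):
    """Split variants by direction of enrichment and score the region.

    genotypes: list of rows, one per individual, of 0/1 rare-allele carrier flags
               (hypothetical encoding).
    labels:    list of 1 (case) / 0 (control), same length as genotypes.
    Returns (statistic, case_enriched_indices, control_enriched_indices).
    """
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    case_set, ctrl_set = [], []

    # Step 1: assign each variant to the subset where it is more frequent.
    for j in range(len(genotypes[0])):
        f_case = sum(g[j] for g, y in zip(genotypes, labels) if y == 1) / n_case
        f_ctrl = sum(g[j] for g, y in zip(genotypes, labels) if y == 0) / n_ctrl
        if f_case > f_ctrl:
            case_set.append(j)
        elif f_ctrl > f_case:
            ctrl_set.append(j)
        # Variants with equal frequencies are left out of both subsets.

    # Step 2: collapse each subset to "carries any variant in the subset"
    # and accumulate the squared case-control difference in carrier frequency.
    stat = 0.0
    for subset in (case_set, ctrl_set):
        if not subset:
            continue
        c_case = sum(1 for g, y in zip(genotypes, labels)
                     if y == 1 and any(g[j] for j in subset)) / n_case
        c_ctrl = sum(1 for g, y in zip(genotypes, labels)
                     if y == 0 and any(g[j] for j in subset)) / n_ctrl
        stat += (c_case - c_ctrl) ** 2
    return stat, case_set, ctrl_set


def permutation_p_value(genotypes, labels, n_perm=1000, seed=0):
    """Permutation p-value; the split is recomputed on every shuffle so that
    the data-driven selection of subsets is reflected in the null distribution."""
    rng = random.Random(seed)
    observed, _, _ = split_statistic(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        stat, _, _ = split_statistic(genotypes, shuffled)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because the subsets are chosen by looking at the case-control labels, the split has to be repeated inside each permutation; scoring the permuted data with the subsets selected from the observed data would bias the p-value downward.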
format Text
id pubmed-3000820
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3000820 2010-12-17 Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions Zhang, Lei Pei, Yu-Fang Li, Jian Papasian, Christopher J. Deng, Hong-Wen PLoS One Research Article
Public Library of Science 2010-12-10 /pmc/articles/PMC3000820/ /pubmed/21170328 http://dx.doi.org/10.1371/journal.pone.0014288 Text en Zhang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3000820/
https://www.ncbi.nlm.nih.gov/pubmed/21170328
http://dx.doi.org/10.1371/journal.pone.0014288