Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions
When testing association between rare variants and diseases, an efficient analytical approach involves considering a set of variants in a genomic region as the unit of analysis. One factor complicating this approach is that the vast majority of rare variants in practical applications are believed to...
Main authors: | Zhang, Lei; Pei, Yu-Fang; Li, Jian; Papasian, Christopher J.; Deng, Hong-Wen |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3000820/ https://www.ncbi.nlm.nih.gov/pubmed/21170328 http://dx.doi.org/10.1371/journal.pone.0014288 |
_version_ | 1782193568858767360 |
---|---|
author | Zhang, Lei Pei, Yu-Fang Li, Jian Papasian, Christopher J. Deng, Hong-Wen |
author_facet | Zhang, Lei Pei, Yu-Fang Li, Jian Papasian, Christopher J. Deng, Hong-Wen |
author_sort | Zhang, Lei |
collection | PubMed |
description | When testing association between rare variants and diseases, an efficient analytical approach involves considering a set of variants in a genomic region as the unit of analysis. One factor complicating this approach is that the vast majority of rare variants in practical applications are believed to represent background neutral variation. As a result, analyzing a single set with all variants may not represent a powerful approach. Here, we propose two alternative strategies. In the first, we analyze the subsets of rare variants exhaustively. In the second, we categorize variants selectively into two subsets: one in which variants are overrepresented in cases, and the other in which variants are overrepresented in controls. When the proportion of neutral variants is moderate to large, we show by simulations that both proposed strategies improve statistical power over methods that analyze a single set of all variants. When applied to a real sequencing association study, the proposed methods consistently produce smaller p-values than their competitors. When applied to another real sequencing dataset to study the difference in rare allele distributions between ethnic populations, the proposed methods detect the overrepresentation of variants between the CHB (Chinese Han in Beijing) and YRI (Yoruba people of Ibadan) populations with small p-values. Additional analyses suggest that there is no difference between the CHB and CHD (Chinese Han in Denver) datasets, as expected. Finally, when applied to the CHB and JPT (Japanese people in Tokyo) populations, existing methods fail to detect any difference, whereas the proposed methods detect differences in several regions. |
format | Text |
id | pubmed-3000820 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3000820 2010-12-17 Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions Zhang, Lei Pei, Yu-Fang Li, Jian Papasian, Christopher J. Deng, Hong-Wen PLoS One Research Article When testing association between rare variants and diseases, an efficient analytical approach involves considering a set of variants in a genomic region as the unit of analysis. One factor complicating this approach is that the vast majority of rare variants in practical applications are believed to represent background neutral variation. As a result, analyzing a single set with all variants may not represent a powerful approach. Here, we propose two alternative strategies. In the first, we analyze the subsets of rare variants exhaustively. In the second, we categorize variants selectively into two subsets: one in which variants are overrepresented in cases, and the other in which variants are overrepresented in controls. When the proportion of neutral variants is moderate to large, we show by simulations that both proposed strategies improve statistical power over methods that analyze a single set of all variants. When applied to a real sequencing association study, the proposed methods consistently produce smaller p-values than their competitors. When applied to another real sequencing dataset to study the difference in rare allele distributions between ethnic populations, the proposed methods detect the overrepresentation of variants between the CHB (Chinese Han in Beijing) and YRI (Yoruba people of Ibadan) populations with small p-values. Additional analyses suggest that there is no difference between the CHB and CHD (Chinese Han in Denver) datasets, as expected. Finally, when applied to the CHB and JPT (Japanese people in Tokyo) populations, existing methods fail to detect any difference, whereas the proposed methods detect differences in several regions. 
Public Library of Science 2010-12-10 /pmc/articles/PMC3000820/ /pubmed/21170328 http://dx.doi.org/10.1371/journal.pone.0014288 Text en Zhang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Zhang, Lei Pei, Yu-Fang Li, Jian Papasian, Christopher J. Deng, Hong-Wen Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions |
title | Efficient Utilization of Rare Variants for Detection of Disease-Related Genomic Regions |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3000820/ https://www.ncbi.nlm.nih.gov/pubmed/21170328 http://dx.doi.org/10.1371/journal.pone.0014288 |