
A rank-based normalization method with the fully adjusted full-stage procedure in genetic association studies


Bibliographic Details
Main author: Chien, Li-Chu
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304615/
https://www.ncbi.nlm.nih.gov/pubmed/32559184
http://dx.doi.org/10.1371/journal.pone.0233847
_version_ 1783548290763063296
author Chien, Li-Chu
collection PubMed
description In genetic epidemiology, studies of genotype-phenotype associations have contributed significantly to the genetics of complex human traits. These studies depend on specialized statistical methods to uncover associations between traits and genetic variants, both common and rare. In analyzing such studies, potentially confounding factors, such as social and environmental conditions, often need to be included. Multiple linear regression is the most widely used type of regression analysis when the outcome of interest is a quantitative trait. Many statistical tests for identifying genotype-phenotype associations using linear regression rely on the assumption that the traits (or the residuals of the regression) follow a normal distribution. In genomic research, the rank-based inverse normal transformation (INT) is one of the most popular approaches for obtaining normally distributed traits (or normally distributed residuals). Many researchers believe that applying the INT to non-normal traits (or non-normal residuals) is required for valid inference, because phenotypic (or residual) outliers and non-normality have a significant influence on both type I error rate control and statistical power, especially in rare-variant association testing. Here we propose a test for the association of a rare variant with a quantitative trait using a fully adjusted full-stage INT. Using simulations, we show that the fully adjusted full-stage INT is more appropriate than existing INT methods, such as the fully adjusted two-stage INT and the INT-based omnibus test, for testing genotype-phenotype associations with rare variants, especially when genotypes are uncorrelated with covariates.
The fully adjusted full-stage INT retains the advantages of the fully adjusted two-stage INT and ameliorates its problems in the analysis of rare variants when the trait is non-normal. We also present theoretical results on these desirable properties. In addition, two available methods for non-normal traits, quantile/median regression and the Yeo-Johnson power transformation, are included in the simulations for comparison.
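The rank-based INT named in the description can be illustrated with a minimal sketch. This is only the basic transformation, not the author's fully adjusted full-stage procedure, which the record does not spell out. The function names, the example trait values, and the Blom-style offset of 0.375 are assumptions chosen for illustration.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_inverse_normal(values, offset=0.375):
    """Map each value to a standard normal quantile based on its rank.

    offset=0.375 is an assumed (Blom-style) choice; other offsets are in use.
    """
    n = len(values)
    std_normal = NormalDist()
    return [
        std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1))
        for r in average_ranks(values)
    ]


# Hypothetical skewed trait values with one outlier (15.0).
trait = [0.2, 15.0, 1.1, 0.7, 3.4]
transformed = rank_inverse_normal(trait)
```

Because only ranks enter the transformation, the outlier is pulled in to the largest normal quantile for the sample size, while the ordering of individuals is preserved.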
format Online
Article
Text
id pubmed-7304615
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7304615 2020-06-22 A rank-based normalization method with the fully adjusted full-stage procedure in genetic association studies Chien, Li-Chu PLoS One Research Article Public Library of Science 2020-06-19 /pmc/articles/PMC7304615/ /pubmed/32559184 http://dx.doi.org/10.1371/journal.pone.0233847 Text en © 2020 Li-Chu Chien http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A rank-based normalization method with the fully adjusted full-stage procedure in genetic association studies
topic Research Article