
Multiple Regression Methods Show Great Potential for Rare Variant Association Tests

The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few year...


Bibliographic Details
Main authors: Xu, ChangJiang; Ladouceur, Martin; Dastani, Zari; Richards, J. Brent; Ciampi, Antonio; Greenwood, Celia M. T.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3420665/
https://www.ncbi.nlm.nih.gov/pubmed/22916111
http://dx.doi.org/10.1371/journal.pone.0041694
_version_ 1782240895999934464
author Xu, ChangJiang
Ladouceur, Martin
Dastani, Zari
Richards, J. Brent
Ciampi, Antonio
Greenwood, Celia M. T.
collection PubMed
description The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few years, many new methods have been developed which associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data, namely ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO, are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants are contributing most to the model fit, and therefore both goals of rare variant analysis can be achieved simultaneously with the use of regression regularization methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.
format Online
Article
Text
id pubmed-3420665
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3420665 2012-08-22 Multiple Regression Methods Show Great Potential for Rare Variant Association Tests. Xu, ChangJiang; Ladouceur, Martin; Dastani, Zari; Richards, J. Brent; Ciampi, Antonio; Greenwood, Celia M. T. PLoS One. Research Article.
Public Library of Science 2012-08-08 /pmc/articles/PMC3420665/ /pubmed/22916111 http://dx.doi.org/10.1371/journal.pone.0041694 Text en © 2012 Xu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Multiple Regression Methods Show Great Potential for Rare Variant Association Tests
topic Research Article