MethylPCA: a toolkit to control for confounders in methylome-wide association studies

BACKGROUND: In methylome-wide association studies (MWAS) there are many possible differences between cases and controls (e.g. related to lifestyle, diet, and medication use) that may affect the methylome and produce false positive findings. An effective approach to control for these confounders is...

Bibliographic Details
Main Authors: Chen, Wenan; Gao, Guimin; Nerella, Srilaxmi; Hultman, Christina M; Magnusson, Patrik KE; Sullivan, Patrick F; Aberg, Karolina A; van den Oord, Edwin JCG
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599654/
https://www.ncbi.nlm.nih.gov/pubmed/23452721
http://dx.doi.org/10.1186/1471-2105-14-74
author Chen, Wenan
Gao, Guimin
Nerella, Srilaxmi
Hultman, Christina M
Magnusson, Patrik KE
Sullivan, Patrick F
Aberg, Karolina A
van den Oord, Edwin JCG
collection PubMed
description BACKGROUND: In methylome-wide association studies (MWAS) there are many possible differences between cases and controls (e.g. related to lifestyle, diet, and medication use) that may affect the methylome and produce false positive findings. An effective approach to control for these confounders is to first capture the major sources of variation in the methylation data and then regress out these components in the association analyses. This approach is, however, computationally very challenging because of the extremely large number of methylation sites in the human genome. RESULTS: We introduce MethylPCA, which is specifically designed to control for potential confounders in studies where the number of methylation sites is extremely large. MethylPCA offers a complete and flexible data analysis including 1) an adaptive method that performs data reduction prior to PCA by empirically combining methylation data of neighboring sites, 2) an efficient algorithm that performs a principal component analysis (PCA) on the ultra-high-dimensional data matrix, and 3) association tests. To accomplish this, MethylPCA allows for parallel execution of tasks, uses C++ for CPU- and I/O-intensive calculations, and stores intermediate results to avoid computing the same statistics multiple times or keeping results in memory. Through simulations and an analysis of a real whole-methylome MBD-seq study of 1,500 subjects we show that MethylPCA effectively controls for potential confounders. CONCLUSIONS: MethylPCA provides users with a convenient tool to perform MWAS. The software effectively handles the memory and speed challenges of tasks that would be impossible to accomplish using existing software when millions of sites are interrogated with the sample sizes required for MWAS.
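The approach named in the description (capture the major sources of variation with PCA, then regress those components out in the association tests) can be illustrated with a small sketch. This is not MethylPCA's code or interface: the function names `top_pcs_via_gram` and `site_t_stats`, the simulated data, and the choice of NumPy are all assumptions made for illustration. The PCA step uses the samples-by-samples Gram matrix, a common shortcut when sites greatly outnumber subjects; the record does not say which algorithm MethylPCA itself uses.

```python
import numpy as np

def top_pcs_via_gram(X, k):
    """Top-k principal component scores for the rows (samples) of X.

    X has shape (n_samples, n_sites). With n_samples << n_sites, the
    n x n Gram matrix is far smaller than the site-by-site covariance.
    """
    Xc = X - X.mean(axis=0)                 # center every site
    G = Xc @ Xc.T                           # n x n Gram matrix
    evals, evecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]       # indices of the k largest
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))

def site_t_stats(X, y, covars):
    """t statistic of y for each site, adjusting for an intercept and covars."""
    n = X.shape[0]
    C = np.column_stack([np.ones(n), covars])
    Q, _ = np.linalg.qr(C)                  # orthonormal basis of the covariates
    y_r = y - Q @ (Q.T @ y)                 # residualize the predictor
    X_r = X - Q @ (Q.T @ X)                 # residualize every site at once
    ss_y = y_r @ y_r
    beta = (y_r @ X_r) / ss_y               # per-site slope
    resid = X_r - np.outer(y_r, beta)
    dof = n - C.shape[1] - 1
    se = np.sqrt((resid ** 2).sum(axis=0) / dof / ss_y)
    return beta / se

# Simulated data (hypothetical): a hidden confounder z shifts case status y
# and methylation at many sites, but y has no direct effect on any site.
rng = np.random.default_rng(0)
n, p = 100, 1000
z = rng.normal(size=n)
y = (z + rng.normal(size=n) > 0).astype(float)
X = rng.normal(size=(n, p)) + np.outer(z, rng.normal(size=p))

pcs = top_pcs_via_gram(X, k=1)
t_raw = site_t_stats(X, y, np.empty((n, 0)))   # no adjustment
t_adj = site_t_stats(X, y, pcs)                # first PC as covariate
print("mean t^2 without PCs:", np.mean(t_raw ** 2))
print("mean t^2 with 1 PC:  ", np.mean(t_adj ** 2))
```

In this toy setting the first component tracks the hidden confounder, so including it as a covariate reduces the inflation of the test statistics. Real data would need a principled choice of the number of components.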
format Online
Article
Text
id pubmed-3599654
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3599654 2013-03-23 MethylPCA: a toolkit to control for confounders in methylome-wide association studies Chen, Wenan Gao, Guimin Nerella, Srilaxmi Hultman, Christina M Magnusson, Patrik KE Sullivan, Patrick F Aberg, Karolina A van den Oord, Edwin JCG BMC Bioinformatics Software BioMed Central 2013-03-02 /pmc/articles/PMC3599654/ /pubmed/23452721 http://dx.doi.org/10.1186/1471-2105-14-74 Text en Copyright ©2013 Chen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title MethylPCA: a toolkit to control for confounders in methylome-wide association studies
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599654/
https://www.ncbi.nlm.nih.gov/pubmed/23452721
http://dx.doi.org/10.1186/1471-2105-14-74