Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective

BACKGROUND: The impact of cell-composition effects in analysis of DNA methylation data is now widely appreciated. With the availability of a reference data set consisting of DNA methylation measurements on isolated cell types, it is possible to impute cell proportions and adjust for them, but there...

Bibliographic Details
Main authors: Houseman, E Andres, Kelsey, Karl T, Wiencke, John K, Marsit, Carmen J
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4392865/
https://www.ncbi.nlm.nih.gov/pubmed/25887114
http://dx.doi.org/10.1186/s12859-015-0527-y
author Houseman, E Andres
Kelsey, Karl T
Wiencke, John K
Marsit, Carmen J
collection PubMed
description BACKGROUND: The impact of cell-composition effects in analysis of DNA methylation data is now widely appreciated. With the availability of a reference data set consisting of DNA methylation measurements on isolated cell types, it is possible to impute cell proportions and adjust for them, but there is increasing interest in methods that adjust for cell composition effects when reference sets are incomplete or unavailable. RESULTS: In this article we present a theoretical basis for one such method, showing that the total effect of a phenotype on DNA methylation can be decomposed into orthogonal components, one representing the effect of phenotype on proportions of major cell types, the other representing either subtle effects in composition or global effects at focused loci, and that it is possible to separate these two types of effects in a finite data set. We demonstrate this principle empirically on nine DNA methylation data sets, showing that the first few principal components generally contain a majority of the information on cell-type present in the data, but that later principal components nevertheless contain information about a small number of loci that may represent more focused associations. We also present a new method for determining the number of linear terms to interpret as cell-mixture effects and demonstrate robustness to the choice of this parameter. CONCLUSIONS: Taken together, our work demonstrates that reference-free algorithms for cell-mixture adjustment can produce biologically valid results, separating cell-mediated epigenetic effects (i.e. apparent effects arising from differences in cell composition) from those that are not cell mediated, and that in general the interpretation of associations evident from DNA methylation should be carefully considered. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0527-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4392865
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4392865 2015-04-11
Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective
Houseman, E Andres
Kelsey, Karl T
Wiencke, John K
Marsit, Carmen J
BMC Bioinformatics
Methodology Article
BioMed Central 2015-03-21
/pmc/articles/PMC4392865/ /pubmed/25887114 http://dx.doi.org/10.1186/s12859-015-0527-y
Text en
© Houseman et al.; licensee BioMed Central. 2015
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective
topic Methodology Article