Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective
BACKGROUND: The impact of cell-composition effects in analysis of DNA methylation data is now widely appreciated. With the availability of a reference data set consisting of DNA methylation measurements on isolated cell types, it is possible to impute cell proportions and adjust for them, but there...
Main authors: | Houseman, E Andres; Kelsey, Karl T; Wiencke, John K; Marsit, Carmen J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4392865/ https://www.ncbi.nlm.nih.gov/pubmed/25887114 http://dx.doi.org/10.1186/s12859-015-0527-y |
_version_ | 1782366057989668864 |
---|---|
author | Houseman, E Andres Kelsey, Karl T Wiencke, John K Marsit, Carmen J |
author_facet | Houseman, E Andres Kelsey, Karl T Wiencke, John K Marsit, Carmen J |
author_sort | Houseman, E Andres |
collection | PubMed |
description | BACKGROUND: The impact of cell-composition effects in analysis of DNA methylation data is now widely appreciated. With the availability of a reference data set consisting of DNA methylation measurements on isolated cell types, it is possible to impute cell proportions and adjust for them, but there is increasing interest in methods that adjust for cell composition effects when reference sets are incomplete or unavailable. RESULTS: In this article we present a theoretical basis for one such method, showing that the total effect of a phenotype on DNA methylation can be decomposed into orthogonal components, one representing the effect of phenotype on proportions of major cell types, the other representing either subtle effects in composition or global effects at focused loci, and that it is possible to separate these two types of effects in a finite data set. We demonstrate this principle empirically on nine DNA methylation data sets, showing that the first few principal components generally contain a majority of the information on cell-type present in the data, but that later principal components nevertheless contain information about a small number of loci that may represent more focused associations. We also present a new method for determining the number of linear terms to interpret as cell-mixture effects and demonstrate robustness to the choice of this parameter. CONCLUSIONS: Taken together, our work demonstrates that reference-free algorithms for cell-mixture adjustment can produce biologically valid results, separating cell-mediated epigenetic effects (i.e. apparent effects arising from differences in cell composition) from those that are not cell mediated, and that in general the interpretation of associations evident from DNA methylation should be carefully considered. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0527-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4392865 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4392865 2015-04-11 Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective Houseman, E Andres Kelsey, Karl T Wiencke, John K Marsit, Carmen J BMC Bioinformatics Methodology Article BACKGROUND: The impact of cell-composition effects in analysis of DNA methylation data is now widely appreciated. With the availability of a reference data set consisting of DNA methylation measurements on isolated cell types, it is possible to impute cell proportions and adjust for them, but there is increasing interest in methods that adjust for cell composition effects when reference sets are incomplete or unavailable. RESULTS: In this article we present a theoretical basis for one such method, showing that the total effect of a phenotype on DNA methylation can be decomposed into orthogonal components, one representing the effect of phenotype on proportions of major cell types, the other representing either subtle effects in composition or global effects at focused loci, and that it is possible to separate these two types of effects in a finite data set. We demonstrate this principle empirically on nine DNA methylation data sets, showing that the first few principal components generally contain a majority of the information on cell-type present in the data, but that later principal components nevertheless contain information about a small number of loci that may represent more focused associations. We also present a new method for determining the number of linear terms to interpret as cell-mixture effects and demonstrate robustness to the choice of this parameter. CONCLUSIONS: Taken together, our work demonstrates that reference-free algorithms for cell-mixture adjustment can produce biologically valid results, separating cell-mediated epigenetic effects (i.e. apparent effects arising from differences in cell composition) from those that are not cell mediated, and that in general the interpretation of associations evident from DNA methylation should be carefully considered. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0527-y) contains supplementary material, which is available to authorized users. BioMed Central 2015-03-21 /pmc/articles/PMC4392865/ /pubmed/25887114 http://dx.doi.org/10.1186/s12859-015-0527-y Text en © Houseman et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Houseman, E Andres Kelsey, Karl T Wiencke, John K Marsit, Carmen J Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective |
title | Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective |
title_full | Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective |
title_fullStr | Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective |
title_full_unstemmed | Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective |
title_short | Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective |
title_sort | cell-composition effects in the analysis of dna methylation array data: a mathematical perspective |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4392865/ https://www.ncbi.nlm.nih.gov/pubmed/25887114 http://dx.doi.org/10.1186/s12859-015-0527-y |
work_keys_str_mv | AT housemaneandres cellcompositioneffectsintheanalysisofdnamethylationarraydataamathematicalperspective AT kelseykarlt cellcompositioneffectsintheanalysisofdnamethylationarraydataamathematicalperspective AT wienckejohnk cellcompositioneffectsintheanalysisofdnamethylationarraydataamathematicalperspective AT marsitcarmenj cellcompositioneffectsintheanalysisofdnamethylationarraydataamathematicalperspective |
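
The abstract in this record describes splitting the total effect of a phenotype on DNA methylation into two orthogonal components. The following is a minimal sketch of that idea only, not the authors' algorithm: it assumes the per-locus total effect and a set of cell-mixture directions are already given (in practice these would have to be estimated from data, for example from leading principal components), and it splits the effect by ordinary orthogonal projection. The function names `orthonormalize` and `split_effect` and all numbers are hypothetical, chosen for illustration.

```python
from math import sqrt


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = sqrt(dot(w, w))
        if norm > tol:  # skip directions already in the span
            basis.append([wi / norm for wi in w])
    return basis


def split_effect(total_effect, mixture_directions):
    """Split a per-locus effect vector into two orthogonal parts.

    Returns (cell_mediated, residual): the projection onto the span of
    `mixture_directions`, and what is left over.
    """
    basis = orthonormalize(mixture_directions)
    cell_mediated = [0.0] * len(total_effect)
    for q in basis:
        c = dot(total_effect, q)
        cell_mediated = [m + c * qi for m, qi in zip(cell_mediated, q)]
    residual = [t - m for t, m in zip(total_effect, cell_mediated)]
    return cell_mediated, residual


# Toy data (made up): effects at 5 loci, 2 assumed cell-mixture directions.
total = [0.30, -0.10, 0.20, 0.05, 0.40]
directions = [
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0],
]
mediated, focused = split_effect(total, directions)

# The two parts are orthogonal and add back up to the total effect.
print("orthogonality:", dot(mediated, focused))
print("cell-mediated part:", mediated)
print("remaining part:", focused)
```

In this toy case the fifth locus lies outside the assumed mixture directions, so its whole effect ends up in the remaining part, loosely mirroring the "more focused associations" the abstract mentions.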