Comparison of different cell type correction methods for genome-scale epigenetics studies
BACKGROUND: Whole blood is frequently utilized in genome-wide association studies of DNA methylation patterns in relation to environmental exposures or clinical outcomes. These associations can be confounded by cellular heterogeneity. Algorithms have been developed to measure or adjust for this hete...
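Main authors: | Kaushal, Akhilesh; Zhang, Hongmei; Karmaus, Wilfried J. J.; Ray, Meredith; Torres, Mylin A.; Smith, Alicia K.; Wang, Shu-Li |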
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5391562/ https://www.ncbi.nlm.nih.gov/pubmed/28410574 http://dx.doi.org/10.1186/s12859-017-1611-2 |
_version_ | 1783229297171890176 |
---|---|
author | Kaushal, Akhilesh Zhang, Hongmei Karmaus, Wilfried J. J. Ray, Meredith Torres, Mylin A. Smith, Alicia K. Wang, Shu-Li |
author_sort | Kaushal, Akhilesh |
collection | PubMed |
description | BACKGROUND: Whole blood is frequently used in genome-wide association studies of DNA methylation patterns in relation to environmental exposures or clinical outcomes. These associations can be confounded by cellular heterogeneity. Algorithms have been developed to measure or adjust for this heterogeneity, and some have been compared in the literature. However, with new methods available, it is unknown whether findings are consistent across methods and, if not, which method(s) perform better. METHODS: We compared eight cell-type correction methods: the method in the minfi R package, the method by Houseman et al., the removing unwanted variation (RUV) approach, the methods in the FaST-LMM-EWASher, ReFACTor, RefFreeEWAS, and RefFreeCellMix R programs, and one approach using surrogate variable analysis (SVA). We first evaluated the association of DNA methylation at each CpG across the whole genome with prenatal arsenic exposure levels and with cancer status, adjusted for the cell-type information estimated by each method. We then compared the CpGs identified as statistically significant by the different approaches. For the methods implemented in minfi and proposed by Houseman et al., we used homogeneous data for which the composition of some blood cell types was available and compared the known compositions with the estimated ones. Finally, for methods that do not explicitly estimate cell compositions, we evaluated performance using simulated DNA methylation data with a set of latent variables representing “cell types”. RESULTS: Overall, the SVA-based method showed the highest agreement with all other methods except FaST-LMM-EWASher. Using homogeneous data, minfi provided better estimates of cell-type composition than the originally proposed method by Houseman et al. Further simulation studies on reference-free methods revealed that SVA provided good sensitivities and specificities, that RefFreeCellMix in general produced high sensitivities but tended to have low specificities when confounding was present, and that FaST-LMM-EWASher gave the lowest sensitivity but the highest specificity. CONCLUSIONS: Results from real data and simulations indicate that SVA is recommended when the focus is on identifying informative CpGs. When appropriate reference data are available, the method implemented in the minfi package is recommended. However, if no such reference data are available, or if the focus is not on estimating cell proportions, the SVA method is suggested. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1611-2) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5391562 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5391562 2017-04-14 Comparison of different cell type correction methods for genome-scale epigenetics studies Kaushal, Akhilesh Zhang, Hongmei Karmaus, Wilfried J. J. Ray, Meredith Torres, Mylin A. Smith, Alicia K. Wang, Shu-Li BMC Bioinformatics Research Article BioMed Central 2017-04-14 /pmc/articles/PMC5391562/ /pubmed/28410574 http://dx.doi.org/10.1186/s12859-017-1611-2 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Comparison of different cell type correction methods for genome-scale epigenetics studies |
topic | Research Article |