The ENmix DNA methylation analysis pipeline for Illumina BeadChip and comparisons with seven other preprocessing pipelines
BACKGROUND: Illumina DNA methylation arrays are high-throughput platforms for cost-effective genome-wide profiling of individual CpGs. Experimental and technical factors introduce appreciable measurement variation, some of which can be mitigated by careful “preprocessing” of raw data. METHODS: Here...
Main authors: | Xu, Zongli, Niu, Liang, Taylor, Jack A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8662917/ https://www.ncbi.nlm.nih.gov/pubmed/34886879 http://dx.doi.org/10.1186/s13148-021-01207-1 |
_version_ | 1784613536849199104 |
---|---|
author | Xu, Zongli Niu, Liang Taylor, Jack A. |
author_facet | Xu, Zongli Niu, Liang Taylor, Jack A. |
author_sort | Xu, Zongli |
collection | PubMed |
description | BACKGROUND: Illumina DNA methylation arrays are high-throughput platforms for cost-effective genome-wide profiling of individual CpGs. Experimental and technical factors introduce appreciable measurement variation, some of which can be mitigated by careful “preprocessing” of raw data. METHODS: Here we describe the ENmix preprocessing pipeline and compare it to a set of seven published alternative pipelines (ChAMP, Illumina, SWAN, Funnorm, Noob, wateRmelon, and RnBeads). We use two large sets of duplicate sample measurements with 450 K and EPIC arrays, along with mixtures of isogenic methylated and unmethylated cell line DNA to compare raw data and that preprocessed via different pipelines. RESULTS: Our evaluations show that the ENmix pipeline performs the best with significantly higher correlation and lower absolute difference between duplicate pairs, higher intraclass correlation coefficients (ICC) and smaller deviations from expected methylation level in mixture experiments. In addition to the pipeline function, ENmix software provides an integrated set of functions for reading in raw data files from mouse and human arrays, quality control, data preprocessing, visualization, detection of differentially methylated regions (DMRs), estimation of cell type proportions, and calculation of methylation age clocks. ENmix is computationally efficient, flexible and allows parallel computing. To facilitate further evaluations, we make all datasets and evaluation code publicly available. CONCLUSION: Careful selection of robust data preprocessing methods is critical for DNA methylation array studies. ENmix outperformed other pipelines in our evaluations to minimize experimental variation and to improve data quality and study power. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13148-021-01207-1. |
format | Online Article Text |
id | pubmed-8662917 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8662917 2021-12-13 The ENmix DNA methylation analysis pipeline for Illumina BeadChip and comparisons with seven other preprocessing pipelines Xu, Zongli Niu, Liang Taylor, Jack A. Clin Epigenetics Research BACKGROUND: Illumina DNA methylation arrays are high-throughput platforms for cost-effective genome-wide profiling of individual CpGs. Experimental and technical factors introduce appreciable measurement variation, some of which can be mitigated by careful “preprocessing” of raw data. METHODS: Here we describe the ENmix preprocessing pipeline and compare it to a set of seven published alternative pipelines (ChAMP, Illumina, SWAN, Funnorm, Noob, wateRmelon, and RnBeads). We use two large sets of duplicate sample measurements with 450 K and EPIC arrays, along with mixtures of isogenic methylated and unmethylated cell line DNA to compare raw data and that preprocessed via different pipelines. RESULTS: Our evaluations show that the ENmix pipeline performs the best with significantly higher correlation and lower absolute difference between duplicate pairs, higher intraclass correlation coefficients (ICC) and smaller deviations from expected methylation level in mixture experiments. In addition to the pipeline function, ENmix software provides an integrated set of functions for reading in raw data files from mouse and human arrays, quality control, data preprocessing, visualization, detection of differentially methylated regions (DMRs), estimation of cell type proportions, and calculation of methylation age clocks. ENmix is computationally efficient, flexible and allows parallel computing. To facilitate further evaluations, we make all datasets and evaluation code publicly available. CONCLUSION: Careful selection of robust data preprocessing methods is critical for DNA methylation array studies. ENmix outperformed other pipelines in our evaluations to minimize experimental variation and to improve data quality and study power. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13148-021-01207-1. BioMed Central 2021-12-09 /pmc/articles/PMC8662917/ /pubmed/34886879 http://dx.doi.org/10.1186/s13148-021-01207-1 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Xu, Zongli Niu, Liang Taylor, Jack A. The ENmix DNA methylation analysis pipeline for Illumina BeadChip and comparisons with seven other preprocessing pipelines |
title | The ENmix DNA methylation analysis pipeline for Illumina BeadChip and comparisons with seven other preprocessing pipelines |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8662917/ https://www.ncbi.nlm.nih.gov/pubmed/34886879 http://dx.doi.org/10.1186/s13148-021-01207-1 |
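
The description field above evaluates pipelines by correlation and absolute difference between duplicate pairs and by intraclass correlation coefficients (ICC). The following is a minimal Python sketch of how such duplicate-pair metrics could be computed for methylation beta values. It is not the authors' evaluation code: the function names are hypothetical, the toy values are invented for illustration, and the one-way random-effects ICC(1,1) form is an assumption, since the record does not state which ICC variant the paper uses.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of beta values."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_abs_diff(x, y):
    """Mean absolute difference between a sample and its duplicate."""
    return mean(abs(a - b) for a, b in zip(x, y))


def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for one CpG measured twice per subject.

    `pairs` is a list of (first_measurement, duplicate_measurement) tuples.
    Assumed ICC variant; the catalog record does not specify one.
    """
    n, k = len(pairs), 2
    grand = mean(v for p in pairs for v in p)
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((mean(p) - grand) ** 2 for p in pairs) / (n - 1)
    msw = sum((v - mean(p)) ** 2 for p in pairs for v in p) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


if __name__ == "__main__":
    # Toy beta values for three subjects, each measured in duplicate.
    first = [0.1, 0.5, 0.9]
    duplicate = [0.2, 0.4, 0.8]
    print(pearson(first, duplicate))
    print(mean_abs_diff(first, duplicate))
    print(icc_oneway(list(zip(first, duplicate))))
```

Under this framing, a preprocessing pipeline that reduces technical variation would be expected to raise the correlation and ICC and lower the mean absolute difference when the same functions are applied to its output.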