
recountmethylation enables flexible analysis of public blood DNA methylation array data

SUMMARY: Thousands of DNA methylation (DNAm) array samples from human blood are publicly available on the Gene Expression Omnibus (GEO), but they remain underutilized for experiment planning, replication and cross-study and cross-platform analyses. To facilitate these tasks, we augmented our recount...


Bibliographic Details
Main authors: Maden, Sean K, Walsh, Brian, Ellrott, Kyle, Hansen, Kasper D, Thompson, Reid F, Nellore, Abhinav
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9976962/
https://www.ncbi.nlm.nih.gov/pubmed/36874953
http://dx.doi.org/10.1093/bioadv/vbad020
_version_ 1784899189566603264
author Maden, Sean K
Walsh, Brian
Ellrott, Kyle
Hansen, Kasper D
Thompson, Reid F
Nellore, Abhinav
author_facet Maden, Sean K
Walsh, Brian
Ellrott, Kyle
Hansen, Kasper D
Thompson, Reid F
Nellore, Abhinav
author_sort Maden, Sean K
collection PubMed
description SUMMARY: Thousands of DNA methylation (DNAm) array samples from human blood are publicly available on the Gene Expression Omnibus (GEO), but they remain underutilized for experiment planning, replication and cross-study and cross-platform analyses. To facilitate these tasks, we augmented our recountmethylation R/Bioconductor package with 12 537 uniformly processed EPIC and HM450K blood samples on GEO as well as several new features. We subsequently used our updated package in several illustrative analyses, finding (i) study ID bias adjustment increased variation explained by biological and demographic variables, (ii) most variation in autosomal DNAm was explained by genetic ancestry and CD4+ T-cell fractions and (iii) the dependence of power to detect differential methylation on sample size was similar for each of peripheral blood mononuclear cells (PBMC), whole blood and umbilical cord blood. Finally, we used PBMC and whole blood to perform independent validations, and we recovered 38–46% of differentially methylated probes between sexes from two previously published epigenome-wide association studies. AVAILABILITY AND IMPLEMENTATION: Source code to reproduce the main results is available on GitHub (repo: recountmethylation_flexible-blood-analysis_manuscript; url: https://github.com/metamaden/recountmethylation_flexible-blood-analysis_manuscript). All data was publicly available and downloaded from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/). Compilations of the analyzed public data can be accessed from the website recount.bio/data (preprocessed HM450K array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/; preprocessed EPIC array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9976962
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9976962 2023-03-02 recountmethylation enables flexible analysis of public blood DNA methylation array data Maden, Sean K Walsh, Brian Ellrott, Kyle Hansen, Kasper D Thompson, Reid F Nellore, Abhinav Bioinform Adv Original Paper SUMMARY: Thousands of DNA methylation (DNAm) array samples from human blood are publicly available on the Gene Expression Omnibus (GEO), but they remain underutilized for experiment planning, replication and cross-study and cross-platform analyses. To facilitate these tasks, we augmented our recountmethylation R/Bioconductor package with 12 537 uniformly processed EPIC and HM450K blood samples on GEO as well as several new features. We subsequently used our updated package in several illustrative analyses, finding (i) study ID bias adjustment increased variation explained by biological and demographic variables, (ii) most variation in autosomal DNAm was explained by genetic ancestry and CD4+ T-cell fractions and (iii) the dependence of power to detect differential methylation on sample size was similar for each of peripheral blood mononuclear cells (PBMC), whole blood and umbilical cord blood. Finally, we used PBMC and whole blood to perform independent validations, and we recovered 38–46% of differentially methylated probes between sexes from two previously published epigenome-wide association studies. AVAILABILITY AND IMPLEMENTATION: Source code to reproduce the main results is available on GitHub (repo: recountmethylation_flexible-blood-analysis_manuscript; url: https://github.com/metamaden/recountmethylation_flexible-blood-analysis_manuscript). All data was publicly available and downloaded from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/). Compilations of the analyzed public data can be accessed from the website recount.bio/data (preprocessed HM450K array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/; preprocessed EPIC array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2023-02-20 /pmc/articles/PMC9976962/ /pubmed/36874953 http://dx.doi.org/10.1093/bioadv/vbad020 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Maden, Sean K
Walsh, Brian
Ellrott, Kyle
Hansen, Kasper D
Thompson, Reid F
Nellore, Abhinav
recountmethylation enables flexible analysis of public blood DNA methylation array data
title recountmethylation enables flexible analysis of public blood DNA methylation array data
title_full recountmethylation enables flexible analysis of public blood DNA methylation array data
title_fullStr recountmethylation enables flexible analysis of public blood DNA methylation array data
title_full_unstemmed recountmethylation enables flexible analysis of public blood DNA methylation array data
title_short recountmethylation enables flexible analysis of public blood DNA methylation array data
title_sort recountmethylation enables flexible analysis of public blood dna methylation array data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9976962/
https://www.ncbi.nlm.nih.gov/pubmed/36874953
http://dx.doi.org/10.1093/bioadv/vbad020
work_keys_str_mv AT madenseank recountmethylationenablesflexibleanalysisofpublicblooddnamethylationarraydata
AT walshbrian recountmethylationenablesflexibleanalysisofpublicblooddnamethylationarraydata
AT ellrottkyle recountmethylationenablesflexibleanalysisofpublicblooddnamethylationarraydata
AT hansenkasperd recountmethylationenablesflexibleanalysisofpublicblooddnamethylationarraydata
AT thompsonreidf recountmethylationenablesflexibleanalysisofpublicblooddnamethylationarraydata
AT nelloreabhinav recountmethylationenablesflexibleanalysisofpublicblooddnamethylationarraydata