Human methylome variation across Infinium 450K data on the Gene Expression Omnibus

While DNA methylation (DNAm) is the most-studied epigenetic mark, few recent studies probe the breadth of publicly available DNAm array samples. We collectively analyzed 35 360 Illumina Infinium HumanMethylation450K DNAm array samples published on the Gene Expression Omnibus. We learned a controlled...

Bibliographic Details
Main authors: Maden, Sean K; Thompson, Reid F; Hansen, Kasper D; Nellore, Abhinav
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8061458/
https://www.ncbi.nlm.nih.gov/pubmed/33937763
http://dx.doi.org/10.1093/nargab/lqab025
author Maden, Sean K
Thompson, Reid F
Hansen, Kasper D
Nellore, Abhinav
collection PubMed
description While DNA methylation (DNAm) is the most-studied epigenetic mark, few recent studies probe the breadth of publicly available DNAm array samples. We collectively analyzed 35 360 Illumina Infinium HumanMethylation450K DNAm array samples published on the Gene Expression Omnibus. We learned a controlled vocabulary of sample labels by applying regular expressions to metadata and used existing models to predict various sample properties including epigenetic age. We found approximately two-thirds of samples were from blood, one-quarter were from brain and one-third were from cancer patients. About 19% of samples failed at least one of Illumina’s 17 prescribed quality assessments; signal distributions across samples suggest modifying manufacturer-recommended thresholds for failure would make these assessments more informative. We further analyzed DNAm variances in seven tissues (adipose, nasal, blood, brain, buccal, sperm and liver) and characterized specific probes distinguishing them. Finally, we compiled DNAm array data and metadata, including our learned and predicted sample labels, into database files accessible via the recountmethylation R/Bioconductor companion package. Its vignettes walk the user through some analyses contained in this paper.
format Online
Article
Text
id pubmed-8061458
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8061458 2021-04-29 Human methylome variation across Infinium 450K data on the Gene Expression Omnibus Maden, Sean K Thompson, Reid F Hansen, Kasper D Nellore, Abhinav NAR Genom Bioinform Standard Article Oxford University Press 2021-04-22 /pmc/articles/PMC8061458/ /pubmed/33937763 http://dx.doi.org/10.1093/nargab/lqab025 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Human methylome variation across Infinium 450K data on the Gene Expression Omnibus
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8061458/
https://www.ncbi.nlm.nih.gov/pubmed/33937763
http://dx.doi.org/10.1093/nargab/lqab025