Bigmelon: tools for analysing large DNA methylation datasets

MOTIVATION: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result th...

Bibliographic Details
Main authors: Gorrie-Stone, Tyler J, Smart, Melissa C, Saffari, Ayden, Malki, Karim, Hannon, Eilis, Burrage, Joe, Mill, Jonathan, Kumari, Meena, Schalkwyk, Leonard C
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419913/
https://www.ncbi.nlm.nih.gov/pubmed/30875430
http://dx.doi.org/10.1093/bioinformatics/bty713
author Gorrie-Stone, Tyler J
Smart, Melissa C
Saffari, Ayden
Malki, Karim
Hannon, Eilis
Burrage, Joe
Mill, Jonathan
Kumari, Meena
Schalkwyk, Leonard C
collection PubMed
description MOTIVATION: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data. RESULTS: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform. AVAILABILITY AND IMPLEMENTATION: The bigmelon package is available on Bioconductor (http://bioconductor.org/packages/bigmelon/). The Understanding Society dataset is available at https://www.understandingsociety.ac.uk/about/health/data upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6419913
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6419913 2019-03-20 Bioinformatics Original Papers Oxford University Press 2019-03-15 2018-08-23 /pmc/articles/PMC6419913/ /pubmed/30875430 http://dx.doi.org/10.1093/bioinformatics/bty713 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bigmelon: tools for analysing large DNA methylation datasets
topic Original Papers