
Comparison of pre-processing methodologies for Illumina 450k methylation array data in familial analyses

BACKGROUND: Human methylome mapping in health and disease states has largely relied on Illumina Human Methylation 450k array (450k array) technology. Accompanying this has been the necessary evolution of analysis pipelines to facilitate data processing. The majority of these pipelines, however, cate...


Bibliographic Details
Main authors: Cazaly, Emma, Thomson, Russell, Marthick, James R., Holloway, Adele F., Charlesworth, Jac, Dickinson, Joanne L.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4947255/
https://www.ncbi.nlm.nih.gov/pubmed/27429663
http://dx.doi.org/10.1186/s13148-016-0241-2
author Cazaly, Emma
Thomson, Russell
Marthick, James R.
Holloway, Adele F.
Charlesworth, Jac
Dickinson, Joanne L.
author_sort Cazaly, Emma
collection PubMed
description BACKGROUND: Human methylome mapping in health and disease states has largely relied on Illumina Human Methylation 450k array (450k array) technology. Accompanying this has been the necessary evolution of analysis pipelines to facilitate data processing. The majority of these pipelines, however, cater for experimental designs where matched ‘controls’ or ‘normal’ samples are available. Experimental designs where no appropriate ‘reference’ exists remain challenging. Herein, we use data generated from our study of the inheritance of methylome profiles in families to evaluate the performance of eight normalisation pre-processing methods. Fifty individual samples representing four families were interrogated on five 450k array BeadChips. Eight normalisation methods were tested using qualitative and quantitative metrics, to assess efficacy and suitability. RESULTS: Stratified quantile normalisation combined with ComBat was consistently found to be the most appropriate when assessed using density, MDS and cluster plots. This was supported quantitatively by ANOVA on the first principal component where the effect of batch dropped from p < 0.01 to p = 0.97 after stratified QN and ComBat. Median absolute differences between replicated samples were the lowest after stratified QN and ComBat as were the standard error measures on known imprinted regions. Biological information was preserved after normalisation as indicated by the maintenance of a significant association between a known mQTL and methylation (p = 1.05e-05). CONCLUSIONS: A strategy combining stratified QN with ComBat is appropriate for use in the analyses when no reference sample is available but preservation of biological variation is paramount. There is great potential for use of 450k array data to further our understanding of the methylome in a variety of similar settings. Such advances will be reliant on the determination of appropriate methodologies for processing these data such as established here. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13148-016-0241-2) contains supplementary material, which is available to authorized users.
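One of the quantitative metrics named in the abstract, the median absolute difference between replicated samples, can be illustrated with a minimal sketch. This is not the authors' code: the function name `median_absolute_difference` and the beta values are hypothetical, chosen only to show the arithmetic on two measurements of the same sample.

```python
from statistics import median


def median_absolute_difference(betas_a, betas_b):
    """Median of |a - b| across probes for two replicate beta-value vectors."""
    if len(betas_a) != len(betas_b):
        raise ValueError("replicates must cover the same probes")
    return median(abs(a - b) for a, b in zip(betas_a, betas_b))


# Made-up beta values for five probes, measured twice on the same sample.
replicate_1 = [0.10, 0.52, 0.88, 0.35, 0.71]
replicate_2 = [0.12, 0.50, 0.85, 0.36, 0.75]

print(round(median_absolute_difference(replicate_1, replicate_2), 3))
```

A lower value after normalisation would suggest that technical replicates agree more closely, which is how the abstract uses the metric to compare methods.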
format Online
Article
Text
id pubmed-4947255
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4947255 2016-07-17 Comparison of pre-processing methodologies for Illumina 450k methylation array data in familial analyses Cazaly, Emma Thomson, Russell Marthick, James R. Holloway, Adele F. Charlesworth, Jac Dickinson, Joanne L. Clin Epigenetics Methodology BACKGROUND: Human methylome mapping in health and disease states has largely relied on Illumina Human Methylation 450k array (450k array) technology. Accompanying this has been the necessary evolution of analysis pipelines to facilitate data processing. The majority of these pipelines, however, cater for experimental designs where matched ‘controls’ or ‘normal’ samples are available. Experimental designs where no appropriate ‘reference’ exists remain challenging. Herein, we use data generated from our study of the inheritance of methylome profiles in families to evaluate the performance of eight normalisation pre-processing methods. Fifty individual samples representing four families were interrogated on five 450k array BeadChips. Eight normalisation methods were tested using qualitative and quantitative metrics, to assess efficacy and suitability. RESULTS: Stratified quantile normalisation combined with ComBat was consistently found to be the most appropriate when assessed using density, MDS and cluster plots. This was supported quantitatively by ANOVA on the first principal component where the effect of batch dropped from p < 0.01 to p = 0.97 after stratified QN and ComBat. Median absolute differences between replicated samples were the lowest after stratified QN and ComBat as were the standard error measures on known imprinted regions. Biological information was preserved after normalisation as indicated by the maintenance of a significant association between a known mQTL and methylation (p = 1.05e-05). CONCLUSIONS: A strategy combining stratified QN with ComBat is appropriate for use in the analyses when no reference sample is available but preservation of biological variation is paramount. There is great potential for use of 450k array data to further our understanding of the methylome in a variety of similar settings. Such advances will be reliant on the determination of appropriate methodologies for processing these data such as established here. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13148-016-0241-2) contains supplementary material, which is available to authorized users. BioMed Central 2016-07-16 /pmc/articles/PMC4947255/ /pubmed/27429663 http://dx.doi.org/10.1186/s13148-016-0241-2 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology
Cazaly, Emma
Thomson, Russell
Marthick, James R.
Holloway, Adele F.
Charlesworth, Jac
Dickinson, Joanne L.
Comparison of pre-processing methodologies for Illumina 450k methylation array data in familial analyses
title Comparison of pre-processing methodologies for Illumina 450k methylation array data in familial analyses
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4947255/
https://www.ncbi.nlm.nih.gov/pubmed/27429663
http://dx.doi.org/10.1186/s13148-016-0241-2