
Data processing choices can affect findings in differential methylation analyses: an investigation using data from the LIMIT RCT

OBJECTIVE: A wide array of methods exist for processing and analysing DNA methylation data. We aimed to perform a systematic comparison of the behaviour of these methods, using cord blood DNAm from the LIMIT RCT, in relation to detecting hypothesised effects of interest (intervention and pre-pregnan...


Bibliographic Details
Main Authors: Louise, Jennie; Deussen, Andrea R.; Dodd, Jodie M.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9901304/
https://www.ncbi.nlm.nih.gov/pubmed/36755865
http://dx.doi.org/10.7717/peerj.14786
author Louise, Jennie
Deussen, Andrea R.
Dodd, Jodie M.
collection PubMed
description OBJECTIVE: A wide array of methods exist for processing and analysing DNA methylation data. We aimed to perform a systematic comparison of the behaviour of these methods, using cord blood DNAm from the LIMIT RCT, in relation to detecting hypothesised effects of interest (intervention and pre-pregnancy maternal BMI) as well as effects known to be spurious, and known to be present. METHODS: DNAm data, from 645 cord blood samples analysed using Illumina 450K BeadChip arrays, were normalised using three different methods (with probe filtering undertaken pre- or post-normalisation). Batch effects were handled with a supervised algorithm, an unsupervised algorithm, or adjustment in the analysis model. Analysis was undertaken with and without adjustment for estimated cell type proportions. The effects estimated included intervention and BMI (effects of interest in the original study), infant sex and randomly assigned groups. Data processing and analysis methods were compared in relation to number and identity of differentially methylated probes, rankings of probes by p value and log-fold-change, and distributions of p values and log-fold-change estimates. RESULTS: There were differences corresponding to each of the processing and analysis choices. Importantly, some combinations of data processing choices resulted in a substantial number of spurious ‘significant’ findings. We recommend greater emphasis on replication and greater use of sensitivity analyses.
format Online
Article
Text
id pubmed-9901304
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9901304 2023-02-07 Data processing choices can affect findings in differential methylation analyses: an investigation using data from the LIMIT RCT Louise, Jennie Deussen, Andrea R. Dodd, Jodie M. PeerJ Bioinformatics PeerJ Inc. 2023-02-03 /pmc/articles/PMC9901304/ /pubmed/36755865 http://dx.doi.org/10.7717/peerj.14786 Text en ©2023 Louise et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Data processing choices can affect findings in differential methylation analyses: an investigation using data from the LIMIT RCT
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9901304/
https://www.ncbi.nlm.nih.gov/pubmed/36755865
http://dx.doi.org/10.7717/peerj.14786