Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data

Bibliographic Details
Main authors: Gervin, Kristina; Salas, Lucas A.; Bakulski, Kelly M.; van Zelm, Menno C.; Koestler, Devin C.; Wiencke, John K.; Duijts, Liesbeth; Moll, Henriëtte A.; Kelsey, Karl T.; Kobor, Michael S.; Lyle, Robert; Christensen, Brock C.; Felix, Janine F.; Jones, Meaghan J.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6712867/
https://www.ncbi.nlm.nih.gov/pubmed/31455416
http://dx.doi.org/10.1186/s13148-019-0717-y
author Gervin, Kristina
Salas, Lucas A.
Bakulski, Kelly M.
van Zelm, Menno C.
Koestler, Devin C.
Wiencke, John K.
Duijts, Liesbeth
Moll, Henriëtte A.
Kelsey, Karl T.
Kobor, Michael S.
Lyle, Robert
Christensen, Brock C.
Felix, Janine F.
Jones, Meaghan J.
collection PubMed
description BACKGROUND: Umbilical cord blood (UCB) is commonly used in epigenome-wide association studies of prenatal exposures. Accounting for cell type composition is critical in such studies as it reduces confounding due to the cell specificity of DNA methylation (DNAm). In the absence of cell sorting information, statistical methods can be applied to deconvolve heterogeneous cell mixtures. Among these methods, reference-based approaches leverage age-appropriate cell-specific DNAm profiles to estimate cellular composition. In UCB, four reference datasets comprising DNAm signatures profiled in purified cell populations have been published using the Illumina 450 K and EPIC arrays. These datasets are biologically and technically different, and currently, there is no consensus on how to best apply them. Here, we systematically evaluate and compare these datasets and provide recommendations for reference-based UCB deconvolution.
RESULTS: We first evaluated the four reference datasets to ascertain both the purity of the samples and the potential cell cross-contamination. We filtered samples and combined datasets to obtain a joint UCB reference. We selected deconvolution libraries using two different approaches: automatic selection using the top differentially methylated probes from the function pickCompProbes in minfi and a standardized library selected using the IDOL (Identifying Optimal Libraries) iterative algorithm. We compared the performance of each reference separately and in combination, using the two approaches for reference library selection, and validated the results in an independent cohort (Generation R Study, n = 191) with matched Fluorescence-Activated Cell Sorting measured cell counts. Strict filtering and combination of the references significantly improved the accuracy and efficiency of cell type estimates. Ultimately, the IDOL library outperformed the library from the automatic selection method implemented in pickCompProbes.
CONCLUSION: These results have important implications for epigenetic studies in UCB as implementing this method will optimally reduce confounding due to cellular heterogeneity. This work provides guidelines for future reference-based UCB deconvolution and establishes a framework for combining reference datasets in other tissues.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13148-019-0717-y) contains supplementary material, which is available to authorized users.
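The reference-based deconvolution mentioned in the description can be illustrated with a small sketch. It is not the authors' pipeline and uses neither minfi nor IDOL. It assumes a made-up reference matrix (six probes by three cell types) and estimates non-negative cell-type weights by projected gradient descent on a least-squares objective. Every name and number below is hypothetical.

```python
# Illustrative sketch only: hypothetical data, not the published UCB reference.

# Rows are probes, columns are cell types; values are mean methylation (beta) levels.
REFERENCE = [
    [0.9, 0.1, 0.1],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.1, 0.9],
    [0.1, 0.2, 0.8],
]
TRUE_PROPORTIONS = [0.5, 0.3, 0.2]  # assumed ground truth for the toy mixture


def estimate_proportions(reference, mixture, n_iter=2000):
    """Return non-negative weights w that approximately minimise ||reference @ w - mixture||^2."""
    n_probes = len(reference)
    n_types = len(reference[0])
    # A step of 1 / ||R||_F^2 never exceeds 1 / lambda_max(R^T R), so the updates stay stable.
    step = 1.0 / sum(v * v for row in reference for v in row)
    w = [1.0 / n_types] * n_types  # start from equal proportions
    for _ in range(n_iter):
        residual = [
            sum(reference[i][k] * w[k] for k in range(n_types)) - mixture[i]
            for i in range(n_probes)
        ]
        grad = [
            sum(reference[i][k] * residual[i] for i in range(n_probes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[k] - step * grad[k]) for k in range(n_types)]
    return w


# Simulate a noise-free bulk sample as a weighted sum of the reference profiles.
mixture = [sum(r * p for r, p in zip(row, TRUE_PROPORTIONS)) for row in REFERENCE]

if __name__ == "__main__":
    estimated = estimate_proportions(REFERENCE, mixture)
    print([round(v, 3) for v in estimated])
```

In this toy setting the probes separate the cell types cleanly, which is the property that a library selection step is meant to secure. With real data, the choice of probes and the purity of the reference samples determine how close the estimates come to measured cell counts.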
format Online
Article
Text
id pubmed-6712867
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6712867 2019-09-04 Clin Epigenetics Methodology BioMed Central 2019-08-27 /pmc/articles/PMC6712867/ /pubmed/31455416 http://dx.doi.org/10.1186/s13148-019-0717-y Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6712867/
https://www.ncbi.nlm.nih.gov/pubmed/31455416
http://dx.doi.org/10.1186/s13148-019-0717-y