
Identifying and correcting epigenetics measurements for systematic sources of variation

BACKGROUND: Methylation measures quantified by microarray techniques can be affected by systematic variation due to the technical processing of samples, which may compromise the accuracy of the measurement process and bias the estimate of the association under investigation. The quanti...


Bibliographic Details
Main authors: Perrier, Flavie, Novoloaca, Alexei, Ambatipudi, Srikant, Baglietto, Laura, Ghantous, Akram, Perduca, Vittorio, Barrdahl, Myrto, Harlid, Sophia, Ong, Ken K., Cardona, Alexia, Polidoro, Silvia, Nøst, Therese Haugdahl, Overvad, Kim, Omichessan, Hanane, Dollé, Martijn, Bamia, Christina, Huerta, José Marìa, Vineis, Paolo, Herceg, Zdenko, Romieu, Isabelle, Ferrari, Pietro
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5863487/
https://www.ncbi.nlm.nih.gov/pubmed/29588806
http://dx.doi.org/10.1186/s13148-018-0471-6
author Perrier, Flavie
Novoloaca, Alexei
Ambatipudi, Srikant
Baglietto, Laura
Ghantous, Akram
Perduca, Vittorio
Barrdahl, Myrto
Harlid, Sophia
Ong, Ken K.
Cardona, Alexia
Polidoro, Silvia
Nøst, Therese Haugdahl
Overvad, Kim
Omichessan, Hanane
Dollé, Martijn
Bamia, Christina
Huerta, José Marìa
Vineis, Paolo
Herceg, Zdenko
Romieu, Isabelle
Ferrari, Pietro
author_sort Perrier, Flavie
collection PubMed
description BACKGROUND: Methylation measures quantified by microarray techniques can be affected by systematic variation due to the technical processing of samples, which may compromise the accuracy of the measurement process and bias the estimate of the association under investigation. Quantifying the contribution of systematic sources of variation is challenging in datasets characterized by hundreds of thousands of features. In this study, we introduce a method previously developed for the analysis of metabolomics data to evaluate the performance of existing normalizing techniques to correct for unwanted variation. The Illumina Infinium HumanMethylation450K array was used to acquire methylation levels at over 421,000 CpG sites for 902 participants in a case-control study on breast cancer nested within the EPIC cohort. The principal component partial R-square (PC-PR2) analysis was used to identify and quantify the variability attributable to potential systematic sources of variation. Three correcting techniques were applied: ComBat, surrogate variable analysis (SVA), and a linear regression model to compute residuals. The impact of each correcting method on the association between smoking status and DNA methylation levels was evaluated, and results were compared with findings from a large meta-analysis.
RESULTS: A sizeable proportion of systematic variability due to variables expressing ‘batch’ and ‘sample position’ within ‘chip’ was identified, with partial R² statistics equal to 9.5 and 11.4% of total variation, respectively. After application of ComBat or the residuals method, the contribution was 1.3 and 0.2%, respectively. The SVA technique reduced the variability due to ‘batch’ (1.3%) and ‘sample position’ (0.6%), and diminished the variability attributable to ‘chip’ within a batch (0.9%). After the ComBat or residuals corrections, a larger number of significant sites (k = 600 and k = 427, respectively) were associated with smoking status than after the SVA correction (k = 96).
CONCLUSIONS: The three correction methods removed systematic variation in DNA methylation data, as assessed by the PC-PR2, which proved a useful tool to explore variability in high-dimensional data. SVA produced more conservative findings than ComBat in the association between smoking and DNA methylation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13148-018-0471-6) contains supplementary material, which is available to authorized users.
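The PC-PR2 idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name pc_pr2, the 80% variance threshold, the numeric coding of covariates, and the partial R² definition used here ((SSE_reduced - SSE_full) / SSE_reduced) are all assumptions made for illustration. The sketch runs a PCA on the feature matrix, regresses each retained component on the covariates, and averages the per-component partial R² of each covariate, weighted by the component eigenvalues.

```python
import numpy as np


def pc_pr2(X, Z, names, var_threshold=0.8):
    """Illustrative PC-PR2-style summary (hypothetical helper, not the published code).

    X: (n_samples, n_features) measurements, e.g. methylation values.
    Z: (n_samples, n_covariates) numeric-coded covariates, e.g. batch.
    names: one label per column of Z.
    Returns {name: eigenvalue-weighted partial R2}.
    """
    # PCA via SVD of the column-centered data.
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    eig = s ** 2

    # Keep the smallest number of components reaching the variance threshold
    # (the 0.8 default is an assumption).
    frac = np.cumsum(eig) / eig.sum()
    k = min(int(np.searchsorted(frac, var_threshold)) + 1, len(eig))
    scores = U[:, :k] * s[:k]
    weights = eig[:k] / eig[:k].sum()

    def sse(design, y):
        # Residual sum of squares of an ordinary least-squares fit.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    ones = np.ones((X.shape[0], 1))
    full = np.hstack([ones, Z])
    result = {}
    for j, name in enumerate(names):
        # Reduced model: all covariates except the one under study.
        reduced = np.hstack([ones, np.delete(Z, j, axis=1)])
        partial = []
        for c in range(k):
            y = scores[:, c]
            sse_red = sse(reduced, y)
            sse_full = sse(full, y)
            partial.append((sse_red - sse_full) / sse_red if sse_red > 0 else 0.0)
        result[name] = float(np.dot(weights, partial))
    return result
```

Called on data before and after a correction such as ComBat or residualization, a summary of this kind would be expected to show the share attributed to a technical covariate shrinking, which is how the abstract uses the statistic.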
format Online
Article
Text
id pubmed-5863487
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5863487 2018-03-27 Identifying and correcting epigenetics measurements for systematic sources of variation Clin Epigenetics Methodology BioMed Central 2018-03-21 /pmc/articles/PMC5863487/ /pubmed/29588806 http://dx.doi.org/10.1186/s13148-018-0471-6 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identifying and correcting epigenetics measurements for systematic sources of variation
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5863487/
https://www.ncbi.nlm.nih.gov/pubmed/29588806
http://dx.doi.org/10.1186/s13148-018-0471-6