
A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data

DNA methylation plays an important role in disease etiology. The Illumina Infinium HumanMethylation450 (450K) BeadChip is a widely used platform in large-scale epidemiologic studies. This platform can efficiently and simultaneously measure methylation levels at ∼480,000 CpG sites in the human genome...


Bibliographic Details
Main Authors: Wang, Ting; Guan, Weihua; Lin, Jerome; Boutaoui, Nadia; Canino, Glorisa; Luo, Jianhua; Celedón, Juan Carlos; Chen, Wei
Format: Online Article Text
Language: English
Published: Taylor & Francis 2015
Subjects: Research Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4623491/
https://www.ncbi.nlm.nih.gov/pubmed/26036609
http://dx.doi.org/10.1080/15592294.2015.1057384
author Wang, Ting
Guan, Weihua
Lin, Jerome
Boutaoui, Nadia
Canino, Glorisa
Luo, Jianhua
Celedón, Juan Carlos
Chen, Wei
collection PubMed
description DNA methylation plays an important role in disease etiology. The Illumina Infinium HumanMethylation450 (450K) BeadChip is a widely used platform in large-scale epidemiologic studies. This platform can efficiently and simultaneously measure methylation levels at ∼480,000 CpG sites in the human genome in multiple study samples. Due to the intrinsic chip design of 2 types of chemistry probes, data normalization or preprocessing is a critical step to consider before data analysis. To date, numerous methods and pipelines have been developed for this purpose, and some studies have been conducted to evaluate different methods. However, validation studies have often been limited to a small number of CpG sites to reduce the variability in technical replicates. In this study, we measured methylation on a set of samples using both whole-genome bisulfite sequencing (WGBS) and 450K chips. We used WGBS data as a gold standard of true methylation states in cells to compare the performances of 8 normalization methods for 450K data on a genome-wide scale. Analyses on our dataset indicate that the most effective methods are peak-based correction (PBC) and quantile normalization plus β-mixture quantile normalization (QN.BMIQ). To our knowledge, this is the first study to systematically compare existing normalization methods for Illumina 450K data using novel WGBS data. Our results provide a benchmark reference for the analysis of DNA methylation chip data, particularly in white blood cells.
format Online
Article
Text
id pubmed-4623491
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Taylor & Francis
record_format MEDLINE/PubMed
spelling pubmed-4623491 2016-02-03 Wang, Ting; Guan, Weihua; Lin, Jerome; Boutaoui, Nadia; Canino, Glorisa; Luo, Jianhua; Celedón, Juan Carlos; Chen, Wei. Epigenetics. Research Paper. Taylor & Francis 2015-06-02 /pmc/articles/PMC4623491/ /pubmed/26036609 http://dx.doi.org/10.1080/15592294.2015.1057384 Text en © 2015 The Author(s). Published with license by Taylor & Francis Group, LLC.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted.
title A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4623491/
https://www.ncbi.nlm.nih.gov/pubmed/26036609
http://dx.doi.org/10.1080/15592294.2015.1057384