
A statistical method for excluding non-variable CpG sites in high-throughput DNA methylation profiling

BACKGROUND: High-throughput DNA methylation arrays are likely to accelerate the pace of methylation biomarker discovery for a wide variety of diseases. A potential problem with a standard set of probes measuring the methylation status of CpG sites across the whole genome is that many sites may not s...


Bibliographic Details
Main Authors: Meng, Hailong; Joyce, Andrew R; Adkins, Daniel E; Basu, Priyadarshi; Jia, Yankai; Li, Guoya; Sengupta, Tapas K; Zedler, Barbara K; Murrelle, E Lenn; van den Oord, Edwin JCG
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2876131/
https://www.ncbi.nlm.nih.gov/pubmed/20441598
http://dx.doi.org/10.1186/1471-2105-11-227
_version_ 1782181668525703168
author Meng, Hailong
Joyce, Andrew R
Adkins, Daniel E
Basu, Priyadarshi
Jia, Yankai
Li, Guoya
Sengupta, Tapas K
Zedler, Barbara K
Murrelle, E Lenn
van den Oord, Edwin JCG
collection PubMed
description BACKGROUND: High-throughput DNA methylation arrays are likely to accelerate the pace of methylation biomarker discovery for a wide variety of diseases. A potential problem with a standard set of probes measuring the methylation status of CpG sites across the whole genome is that many sites may not show inter-individual methylation variation among the biosamples for the disease outcome being studied. Inclusion of these so-called "non-variable sites" will increase the risk of false discoveries and reduce statistical power to detect biologically relevant methylation markers. RESULTS: We propose a method to estimate the proportion of non-variable CpG sites and eliminate those sites from further analyses. Our method is illustrated using data obtained by hybridizing DNA extracted from the peripheral blood mononuclear cells of 311 samples to an array assaying 1505 CpG sites. Results showed that a large proportion of the CpG sites did not show inter-individual variation in methylation. CONCLUSIONS: Our method resulted in a substantial improvement in association signals between methylation sites and outcome variables while controlling the false discovery rate at the same level.
format Text
id pubmed-2876131
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2876131 2010-05-26 BMC Bioinformatics Methodology article BioMed Central 2010-05-05 /pmc/articles/PMC2876131/ /pubmed/20441598 http://dx.doi.org/10.1186/1471-2105-11-227 Text en Copyright ©2010 Meng et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A statistical method for excluding non-variable CpG sites in high-throughput DNA methylation profiling
topic Methodology article