
A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data

DNA methylation is an important epigenetic event that affects gene expression during development and in various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of...


Bibliographic Details
Main authors: Baur, Brittany, Bozdag, Serdar
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4752315/
https://www.ncbi.nlm.nih.gov/pubmed/26872146
http://dx.doi.org/10.1371/journal.pone.0148977
author Baur, Brittany
Bozdag, Serdar
collection PubMed
description DNA methylation is an important epigenetic event that affects gene expression during development and in various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given the methylation intensities of all these probes, it is necessary to determine which of them are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
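The description above names sequential forward selection wrapped around a K-Nearest Neighbors classifier. The following is a minimal sketch of that general technique, not the authors' implementation: the toy data, the binary high/low expression labels, the leave-one-out accuracy criterion, k=3, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training samples, using only `features`."""
    neighbours = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of KNN restricted to the given probe indices."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], features, k) == y[i]
    return hits / len(X)


def sequential_forward_selection(X, y, k=3, max_features=None):
    """Greedily add the probe that most improves accuracy; stop when none does."""
    n_features = len(X[0])
    limit = max_features or n_features
    selected, best_score = [], 0.0
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(X, y, selected + [f], k), f) for f in candidates]
        # Highest score wins; ties go to the lowest probe index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy data: 40 samples, 5 probes for one gene.
# Only probe 2 tracks the (assumed) binary expression label; the rest are noise.
rng = random.Random(0)
labels = [i % 2 for i in range(40)]
probes = []
for label in labels:
    row = [rng.random() for _ in range(5)]
    centre = 0.2 if label == 1 else 0.8  # low methylation paired with high expression
    row[2] = centre + rng.uniform(-0.1, 0.1)
    probes.append(row)

selected, score = sequential_forward_selection(probes, labels, k=3)
print(selected, score)
```

The stopping rule here (halt as soon as no candidate probe strictly improves the score) is one common choice for forward selection; a fixed subset size or a cross-validated criterion would be equally valid variants.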
format Online
Article
Text
id pubmed-4752315
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4752315 2016-02-26 A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data Baur, Brittany Bozdag, Serdar PLoS One Research Article
Public Library of Science 2016-02-12 /pmc/articles/PMC4752315/ /pubmed/26872146 http://dx.doi.org/10.1371/journal.pone.0148977 Text en © 2016 Baur, Bozdag http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4752315/
https://www.ncbi.nlm.nih.gov/pubmed/26872146
http://dx.doi.org/10.1371/journal.pone.0148977