Particle Swarm Optimization with Reinforcement Learning for the Prediction of CpG Islands in the Human Genome

BACKGROUND: Regions with abundant GC nucleotides, a high CpG number, and a length greater than 200 bp in a genome are often referred to as CpG islands. These islands are usually located in the 5′ end of genes. Recently, several algorithms for the prediction of CpG islands have been proposed. METHODO...
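The three properties named in the background (abundant GC nucleotides, a high CpG number, and a length greater than 200 bp) can be illustrated with a minimal sketch. This is not the CPSORL method. The function names and the numeric cut-offs for GC content (0.5) and the observed/expected CpG ratio (0.6) are assumptions chosen for illustration and are not stated in this record.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed CpG count divided by the count expected from C and G frequencies."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    observed = seq.count("CG")        # "CG" cannot overlap itself
    expected = c * g / len(seq)
    return observed / expected


def looks_like_cpg_island(seq: str,
                          min_len: int = 200,    # from the abstract: longer than 200 bp
                          min_gc: float = 0.5,   # assumed threshold
                          min_oe: float = 0.6    # assumed threshold
                          ) -> bool:
    """Apply the three criteria to one candidate region."""
    return (len(seq) > min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_oe)
```

A sliding-window tool would apply a check of this kind to each fixed-size window, which is why its results depend on the chosen window size.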

Bibliographic Details
Main authors: Chuang, Li-Yeh, Huang, Hsiu-Chen, Lin, Ming-Cheng, Yang, Cheng-Hong
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3125183/
https://www.ncbi.nlm.nih.gov/pubmed/21738602
http://dx.doi.org/10.1371/journal.pone.0021036
author Chuang, Li-Yeh
Huang, Hsiu-Chen
Lin, Ming-Cheng
Yang, Cheng-Hong
author_sort Chuang, Li-Yeh
collection PubMed
description BACKGROUND: Regions with abundant GC nucleotides, a high CpG number, and a length greater than 200 bp in a genome are often referred to as CpG islands. These islands are usually located in the 5′ end of genes. Recently, several algorithms for the prediction of CpG islands have been proposed. METHODOLOGY/PRINCIPAL FINDINGS: We propose here a new method called CPSORL to predict CpG islands, which consists of a complement particle swarm optimization algorithm combined with reinforcement learning to predict CpG islands more reliably. Several CpG island prediction tools equipped with the sliding window technique have been developed previously. However, the quality of the results seems to rely too much on the choices that are made for the window sizes, and thus these methods leave room for improvement. CONCLUSIONS/SIGNIFICANCE: Experimental results indicate that CPSORL provides results of a higher sensitivity and a higher correlation coefficient in all selected experimental contigs than the other methods it was compared to (CpGIS, CpGcluster, CpGProd and CpGPlot). A higher number of CpG islands were identified in chromosomes 21 and 22 of the human genome than with the other methods from the literature. CPSORL also achieved the highest coverage rate (3.4%). CPSORL is an application for identifying promoter and TSS regions associated with CpG islands in the entire human genome. When compared to CpGcluster, the islands predicted by CPSORL covered a larger region in the TSS (12.2%) and promoter (26.1%) regions. If Alu sequences are considered, the islands predicted by CPSORL (Alu) covered a larger TSS (40.5%) and promoter (67.8%) region than CpGIS. Furthermore, CPSORL was used to verify that the average methylation density was 5.33% for CpG islands in the entire human genome.
format Online
Article
Text
id pubmed-3125183
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3125183 2011-07-07 Particle Swarm Optimization with Reinforcement Learning for the Prediction of CpG Islands in the Human Genome Chuang, Li-Yeh Huang, Hsiu-Chen Lin, Ming-Cheng Yang, Cheng-Hong PLoS One Research Article Public Library of Science 2011-06-28 /pmc/articles/PMC3125183/ /pubmed/21738602 http://dx.doi.org/10.1371/journal.pone.0021036 Text en Chuang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Particle Swarm Optimization with Reinforcement Learning for the Prediction of CpG Islands in the Human Genome
topic Research Article