Identification of CpG islands in DNA sequences using statistically optimal null filters
CpG dinucleotide clusters, also referred to as CpG islands (CGIs), are usually located in the promoter regions of genes in a deoxyribonucleic acid (DNA) sequence. CGIs play a crucial role in gene expression and cell differentiation; as such, they are normally used as gene markers. The earlier CGI iden...
Main authors: | Kakumani, Rajasekhar; Ahmad, Omair; Devabhaktuni, Vijay |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3570435/ https://www.ncbi.nlm.nih.gov/pubmed/22931396 http://dx.doi.org/10.1186/1687-4153-2012-12 |
_version_ | 1782259074220425216 |
---|---|
author | Kakumani, Rajasekhar Ahmad, Omair Devabhaktuni, Vijay |
author_facet | Kakumani, Rajasekhar Ahmad, Omair Devabhaktuni, Vijay |
author_sort | Kakumani, Rajasekhar |
collection | PubMed |
description | CpG dinucleotide clusters, also referred to as CpG islands (CGIs), are usually located in the promoter regions of genes in a deoxyribonucleic acid (DNA) sequence. CGIs play a crucial role in gene expression and cell differentiation; as such, they are normally used as gene markers. Earlier CGI identification methods used the rich CpG dinucleotide content of CGIs as a characteristic measure to locate them. Some recent methods exploit the fact that the probability of nucleotide G following nucleotide C is greater in a CGI than in a non-CGI. These methods use the difference in transition probabilities between successive nucleotides to distinguish a CGI from a non-CGI. These transition probabilities vary with the data being analyzed, and several sets of them have been reported in the literature, sometimes leading to contradictory results. In this article, we propose a new and efficient scheme for identification of CGIs using statistically optimal null filters. We formulate a new, unambiguous CGI identification characteristic to reliably and efficiently identify CGIs in a given DNA sequence. Our proposed scheme combines maximum signal-to-noise ratio and least squares optimization criteria to estimate the CGI identification characteristic in the DNA sequence. The proposed scheme is tested on a number of DNA sequences taken from human chromosomes 21 and 22, and proved to be highly reliable as well as efficient in identifying the CGIs. |
format | Online Article Text |
id | pubmed-3570435 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3570435 2013-02-14 Identification of CpG islands in DNA sequences using statistically optimal null filters Kakumani, Rajasekhar Ahmad, Omair Devabhaktuni, Vijay EURASIP J Bioinform Syst Biol Research BioMed Central 2012 2012-08-29 /pmc/articles/PMC3570435/ /pubmed/22931396 http://dx.doi.org/10.1186/1687-4153-2012-12 Text en Copyright ©2012 Kakumani et al.; licensee Springer. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Identification of CpG islands in DNA sequences using statistically optimal null filters |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3570435/ https://www.ncbi.nlm.nih.gov/pubmed/22931396 http://dx.doi.org/10.1186/1687-4153-2012-12 |
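
The description in this record notes that earlier CGI identification methods used the rich CpG dinucleotide content of CGIs as a characteristic measure. A minimal sketch of that kind of content-based screen is given here for orientation only; it is not the statistically optimal null filter scheme the article proposes. The function names, the 200-base window, and the thresholds (GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are assumptions for illustration and do not come from this record.

```python
def cpg_window_stats(seq, window=200, step=1):
    """Yield (start, gc_fraction, obs_exp_ratio) for each sliding window.

    Illustrative helper (hypothetical name); window and step are assumptions.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        c = w.count("C")
        g = w.count("G")
        cpg = w.count("CG")  # "CG" cannot overlap itself, so count() is exact
        gc_fraction = (c + g) / window
        # Observed/expected CpG ratio: (#CpG * N) / (#C * #G).
        # Defined as 0.0 here when the window has no C or no G.
        obs_exp = (cpg * window) / (c * g) if c and g else 0.0
        yield start, gc_fraction, obs_exp


def is_cpg_rich(gc_fraction, obs_exp, min_gc=0.5, min_oe=0.6):
    """Return True if a window passes the assumed content thresholds."""
    return gc_fraction >= min_gc and obs_exp >= min_oe


def cpg_rich_windows(seq, window=200, step=1):
    """Return start positions of windows that pass the content screen."""
    return [
        start
        for start, gc, oe in cpg_window_stats(seq, window, step)
        if is_cpg_rich(gc, oe)
    ]
```

A screen like this depends on the chosen window length and thresholds. The description raises a similar concern about transition probabilities, which vary with the data being analyzed.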