Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics

BACKGROUND: Detecting the borders between coding and non-coding regions is an essential step in genome annotation. Information entropy measures are useful for describing the signals in genome sequences. However, the accuracies of previous methods of finding borders based on entropy segmentati...

Bibliographic Details
Main authors: Deng, Suping, Shi, Yixiang, Yuan, Liyun, Li, Yixue, Ding, Guohui
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3535712/
https://www.ncbi.nlm.nih.gov/pubmed/23282225
http://dx.doi.org/10.1186/1471-2164-13-S8-S19
_version_ 1782254702795161600
author Deng, Suping
Shi, Yixiang
Yuan, Liyun
Li, Yixue
Ding, Guohui
author_sort Deng, Suping
collection PubMed
description BACKGROUND: Detecting the borders between coding and non-coding regions is an essential step in genome annotation. Information entropy measures are useful for describing the signals in genome sequences. However, the accuracy of previous border-finding methods based on entropic segmentation still needs to be improved. METHODS: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to obtain preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. RESULTS: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences. CONCLUSIONS: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For the three bacterial genomes, our method raised the accuracy of finding the borders between protein-coding and non-coding regions in DNA sequences compared with the A12_JR method.
format Online
Article
Text
id pubmed-3535712
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3535712 2013-01-04 Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics Deng, Suping Shi, Yixiang Yuan, Liyun Li, Yixue Ding, Guohui BMC Genomics Research BioMed Central 2012-12-17 /pmc/articles/PMC3535712/ /pubmed/23282225 http://dx.doi.org/10.1186/1471-2164-13-S8-S19 Text en Copyright ©2012 Deng et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3535712/
https://www.ncbi.nlm.nih.gov/pubmed/23282225
http://dx.doi.org/10.1186/1471-2164-13-S8-S19