Testing the Coding Potential of Conserved Short Genomic Sequences

This paper proposes a procedure to test whether a genomic sequence contains coding DNA; such a sequence is called a coding potential region. The procedure tests the coding potential of conserved short genomic sequences while relaxing the assumptions on the probability models of gene structures. Thus, it is expected to p...

Bibliographic Details
Main author: Wu, Jing
Format: Text
Language: English
Published: Hindawi Publishing Corporation 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2834954/
https://www.ncbi.nlm.nih.gov/pubmed/20224812
http://dx.doi.org/10.1155/2010/287070
author Wu, Jing
collection PubMed
description This paper proposes a procedure to test whether a genomic sequence contains coding DNA; such a sequence is called a coding potential region. The procedure tests the coding potential of conserved short genomic sequences while relaxing the assumptions on the probability models of gene structures. It is therefore expected to add candidate regions that contain coding DNA to the current genomic database. The procedure was applied to the set of highly conserved human-mouse sequences in the genome database at the University of California at Santa Cruz. For sequences containing RefSeq coding exons, the procedure detected coding potential in 91.3% of the regions in this set, which covers 83% of the human RefSeq coding exons, at a 2.6% false positive rate. The procedure also detected 12,688 novel short regions with coding potential at a false discovery rate below 0.05; 65.7% of the novel regions lie between annotated genes.
format Text
id pubmed-2834954
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-2834954 2010-03-11
Testing the Coding Potential of Conserved Short Genomic Sequences
Wu, Jing
Adv Bioinformatics
Research Article
Hindawi Publishing Corporation 2010
2010-03-08
/pmc/articles/PMC2834954/ /pubmed/20224812 http://dx.doi.org/10.1155/2010/287070
Text en
Copyright © 2010 Jing Wu. https://creativecommons.org/licenses/by/3.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Testing the Coding Potential of Conserved Short Genomic Sequences
topic Research Article