Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts

Bibliographic Details
Main Authors:	Sun, Liang; Luo, Haitao; Bu, Dechao; Zhao, Guoguang; Yu, Kuntao; Zhang, Changhai; Liu, Yuanning; Chen, Runsheng; Zhao, Yi
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2013
Subjects:	Methods Online
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783192/ https://www.ncbi.nlm.nih.gov/pubmed/23892401 http://dx.doi.org/10.1093/nar/gkt646

Description
Summary:	It is a challenge to classify transcripts as protein-coding or non-coding, especially those reconstructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to effectively distinguish protein-coding and non-coding sequences independent of known annotations. CNCI is effective for classifying incomplete transcripts and sense–antisense pairs. The implementation of CNCI offered highly accurate cross-species classification of transcripts assembled from whole-transcriptome sequencing data, demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
