Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts
It is a challenge to classify protein-coding or non-coding transcripts, especially those re-constructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), by profiling adjoining nucleotide tr...
Main authors: | Sun, Liang; Luo, Haitao; Bu, Dechao; Zhao, Guoguang; Yu, Kuntao; Zhang, Changhai; Liu, Yuanning; Chen, Runsheng; Zhao, Yi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2013 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783192/ https://www.ncbi.nlm.nih.gov/pubmed/23892401 http://dx.doi.org/10.1093/nar/gkt646 |
_version_ | 1782285640558182400 |
---|---|
author | Sun, Liang; Luo, Haitao; Bu, Dechao; Zhao, Guoguang; Yu, Kuntao; Zhang, Changhai; Liu, Yuanning; Chen, Runsheng; Zhao, Yi |
author_facet | Sun, Liang; Luo, Haitao; Bu, Dechao; Zhao, Guoguang; Yu, Kuntao; Zhang, Changhai; Liu, Yuanning; Chen, Runsheng; Zhao, Yi |
author_sort | Sun, Liang |
collection | PubMed |
description | It is a challenge to classify protein-coding or non-coding transcripts, especially those re-constructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), by profiling adjoining nucleotide triplets to effectively distinguish protein-coding and non-coding sequences independent of known annotations. CNCI is effective for classifying incomplete transcripts and sense–antisense pairs. The implementation of CNCI offered highly accurate classification of transcripts assembled from whole-transcriptome sequencing data in a cross-species manner, that demonstrated gene evolutionary divergence between vertebrates, and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci. |
format | Online Article Text |
id | pubmed-3783192 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-37831922013-09-30 Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts Sun, Liang Luo, Haitao Bu, Dechao Zhao, Guoguang Yu, Kuntao Zhang, Changhai Liu, Yuanning Chen, Runsheng Zhao, Yi Nucleic Acids Res Methods Online It is a challenge to classify protein-coding or non-coding transcripts, especially those re-constructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), by profiling adjoining nucleotide triplets to effectively distinguish protein-coding and non-coding sequences independent of known annotations. CNCI is effective for classifying incomplete transcripts and sense–antisense pairs. The implementation of CNCI offered highly accurate classification of transcripts assembled from whole-transcriptome sequencing data in a cross-species manner, that demonstrated gene evolutionary divergence between vertebrates, and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci. Oxford University Press 2013-09 2013-07-27 /pmc/articles/PMC3783192/ /pubmed/23892401 http://dx.doi.org/10.1093/nar/gkt646 Text en © The Author(s) 2013. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Online Sun, Liang Luo, Haitao Bu, Dechao Zhao, Guoguang Yu, Kuntao Zhang, Changhai Liu, Yuanning Chen, Runsheng Zhao, Yi Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts |
title | Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783192/ https://www.ncbi.nlm.nih.gov/pubmed/23892401 http://dx.doi.org/10.1093/nar/gkt646 |
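
The description above says that CNCI works "by profiling adjoining nucleotide triplets". As a rough illustration of what such a profile might look like, the Python sketch below counts pairs of adjacent, non-overlapping triplets in each of the three forward reading frames. It is not the CNCI algorithm or its scoring scheme, neither of which is given in this record; the function names `adjoining_triplet_counts` and `frame_profiles` are hypothetical and chosen only for this example.

```python
from collections import Counter


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of adjacent, non-overlapping triplets in one reading frame.

    Illustrative only: this is a plain frequency count, not CNCI's scoring.
    """
    # Normalise case and treat RNA input (U) as DNA (T).
    seq = seq.upper().replace("U", "T")
    # Split into complete triplets starting at the chosen frame offset;
    # a trailing fragment shorter than three bases is ignored.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Pair each triplet with the one that immediately follows it.
    return Counter(zip(triplets, triplets[1:]))


def frame_profiles(seq):
    """Return the adjoining-triplet counts for forward frames 0, 1 and 2."""
    return {frame: adjoining_triplet_counts(seq, frame) for frame in range(3)}


if __name__ == "__main__":
    profiles = frame_profiles("ATGGCCATGGCC")
    for frame, counts in profiles.items():
        print(frame, dict(counts))
```

A classifier built on this idea would compare such counts against frequencies learned from known coding and non-coding sequences; that step is left out here because the record does not describe it.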