
Tbl2KnownGene: A command-line program to convert NCBI.tbl to UCSC knownGene.txt data file

Bibliographic Details
Main author: Bai, Yongsheng
Format: Online Article Text
Language: English
Published: Biomedical Informatics 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4166776/
https://www.ncbi.nlm.nih.gov/pubmed/25258492
http://dx.doi.org/10.6026/97320630010544
Description
Summary: The schema for UCSC Known Genes (knownGene.txt) has been widely adopted in both standard and custom downstream analysis tools and scripts. For many popular model organisms (e.g. Arabidopsis), sequence and annotation data tables (including “knownGene.txt”) have not yet been made available to the public. Therefore, it is of interest to describe Tbl2KnownGene, a .tbl file parser that processes the contents of an NCBI .tbl file and produces a UCSC Known Genes annotation feature table. The algorithm was tested with chromosome datasets from the Arabidopsis genome (TAIR10). The Tbl2KnownGene parser is also useful for data from other organisms that have similar .tbl annotations. AVAILABILITY: Perl scripts and required input files are available on the web at http://thoth.indstate.edu/~ybai2/Tbl2KnownGene/index.html
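
To make the conversion described in the summary concrete, the following is a minimal Python sketch of the general idea. It is not the author's Perl implementation. It rests on several assumptions that do not come from the record: the input is a tab-delimited NCBI feature table, each CDS feature follows the mRNA it belongs to, the transcript name comes from a `transcript_id` qualifier (falling back to the gene's `locus_tag`), and the output uses the 12-column knownGene layout with 0-based starts. The function names `tbl_to_known_gene` and `_row` are hypothetical.

```python
def _row(tx, chrom):
    """Format one transcript as a knownGene-style line (assumed 12-column layout)."""
    exons = sorted(tx["exons"])
    cds = sorted(tx["cds"])
    name = tx["name"] or tx["locus"] or "unknown"
    tx_start, tx_end = exons[0][0] - 1, exons[-1][1]   # 0-based start, 1-based end
    if cds:
        cds_start, cds_end = cds[0][0] - 1, cds[-1][1]
    else:
        cds_start = cds_end = tx_start                 # assumption: non-coding transcript
    starts = "".join(f"{s - 1}," for s, _ in exons)    # trailing comma, as in UCSC tables
    ends = "".join(f"{e}," for _, e in exons)
    fields = [name, chrom, tx["strand"], tx_start, tx_end, cds_start, cds_end,
              len(exons), starts, ends, "", name]      # proteinID left empty; alignID = name
    return "\t".join(str(f) for f in fields)


def tbl_to_known_gene(tbl_text, chrom):
    """Convert the mRNA/CDS features of a .tbl string into knownGene-style rows."""
    rows, current, feature, locus = [], None, None, None
    for line in tbl_text.splitlines():
        if not line.strip() or line.startswith(">"):
            continue                                   # skip blanks and ">Feature" headers
        cols = line.split("\t")
        if cols[0]:                                    # interval line: start, stop[, key]
            start = int(cols[0].lstrip("<>"))          # drop partial-feature markers
            end = int(cols[1].lstrip("<>"))
            if len(cols) >= 3 and cols[2]:             # a new feature begins here
                feature = cols[2]
                if feature == "mRNA":
                    if current:
                        rows.append(_row(current, chrom))
                    current = {"name": None, "locus": locus, "exons": [], "cds": [],
                               "strand": "+" if start <= end else "-"}
            span = (min(start, end), max(start, end))  # minus strand lists start > stop
            if feature == "mRNA":
                current["exons"].append(span)
            elif feature == "CDS" and current is not None:
                current["cds"].append(span)
        elif len(cols) >= 5:                           # qualifier line: key in col 4, value in col 5
            key, value = cols[3], cols[4]
            if feature == "gene" and key == "locus_tag":
                locus = value
            elif feature == "mRNA" and key == "transcript_id":
                current["name"] = value
    if current:
        rows.append(_row(current, chrom))
    return rows
```

Calling `tbl_to_known_gene(text, "chr1")` on the contents of a single-chromosome .tbl file returns one tab-separated line per mRNA feature. A real converter would also need to handle features this sketch ignores, such as offsets, pseudogenes, and transcripts without an mRNA feature.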