UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation
When working on an ongoing genome sequencing and assembly project, it is rather inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns a unique identifier to each gene t...
Main Authors: | Jackman, Shaun D., Bohlmann, Joerg, Birol, İnanç |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4447347/ https://www.ncbi.nlm.nih.gov/pubmed/26020645 http://dx.doi.org/10.1371/journal.pone.0128026 |
_version_ | 1782373576587870208 |
---|---|
author | Jackman, Shaun D. Bohlmann, Joerg Birol, İnanç |
author_facet | Jackman, Shaun D. Bohlmann, Joerg Birol, İnanç |
author_sort | Jackman, Shaun D. |
collection | PubMed |
description | When working on an ongoing genome sequencing and assembly project, it is rather inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns a unique identifier to each gene that is a representative k-mer, a string of length k, selected from the sequence of that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without requiring that previous annotations be lifted over by sequence alignment. We assign UniqTag identifiers to ten builds of the Ensembl human genome spanning eight years to demonstrate this stability. The implementation of UniqTag in Ruby and an R package are available at https://github.com/sjackman/uniqtag. The R package is also available from CRAN: install.packages("uniqtag"). Supplementary material and code to reproduce it are available at https://github.com/sjackman/uniqtag-paper. |
format | Online Article Text |
id | pubmed-4447347 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4447347 2015-06-09 UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation Jackman, Shaun D. Bohlmann, Joerg Birol, İnanç PLoS One Research Article When working on an ongoing genome sequencing and assembly project, it is rather inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns a unique identifier to each gene that is a representative k-mer, a string of length k, selected from the sequence of that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without requiring that previous annotations be lifted over by sequence alignment. We assign UniqTag identifiers to ten builds of the Ensembl human genome spanning eight years to demonstrate this stability. The implementation of UniqTag in Ruby and an R package are available at https://github.com/sjackman/uniqtag. The R package is also available from CRAN: install.packages("uniqtag"). Supplementary material and code to reproduce it are available at https://github.com/sjackman/uniqtag-paper. Public Library of Science 2015-05-28 /pmc/articles/PMC4447347/ /pubmed/26020645 http://dx.doi.org/10.1371/journal.pone.0128026 Text en © 2015 Jackman et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Jackman, Shaun D. Bohlmann, Joerg Birol, İnanç UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation |
title | UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation |
title_full | UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation |
title_fullStr | UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation |
title_full_unstemmed | UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation |
title_short | UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation |
title_sort | uniqtag: content-derived unique and stable identifiers for gene annotation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4447347/ https://www.ncbi.nlm.nih.gov/pubmed/26020645 http://dx.doi.org/10.1371/journal.pone.0128026 |
work_keys_str_mv | AT jackmanshaund uniqtagcontentderiveduniqueandstableidentifiersforgeneannotation AT bohlmannjoerg uniqtagcontentderiveduniqueandstableidentifiersforgeneannotation AT birolinanc uniqtagcontentderiveduniqueandstableidentifiersforgeneannotation |
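
The description field in this record says that UniqTag labels each gene with a representative k-mer selected from that gene's sequence. The record does not spell out the selection rule, so the Python sketch below assumes one plausible rule: among a gene's k-mers, take those that occur in the fewest genes, and of those take the alphabetically first. The function names `kmers` and `representative_kmers`, the toy peptide sequences, and the choice k = 3 are all hypothetical and for illustration only; this is not the Ruby or R implementation from https://github.com/sjackman/uniqtag.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers of seq (the whole sequence if it is shorter than k)."""
    if len(seq) < k:
        return {seq}
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def representative_kmers(sequences, k):
    """Map each gene name to one representative k-mer.

    Assumed rule (not taken from this record): prefer k-mers found in the
    fewest genes, and break ties by alphabetical order.
    """
    gene_kmers = {name: kmers(seq, k) for name, seq in sequences.items()}
    # Count, for every k-mer, the number of genes that contain it.
    freq = Counter(kmer for ks in gene_kmers.values() for kmer in ks)
    return {
        name: min(ks, key=lambda kmer: (freq[kmer], kmer))
        for name, ks in gene_kmers.items()
    }


# Toy "build 1" with two hypothetical peptide sequences.
build1 = {"geneA": "MKVLAAGIV", "geneB": "MKVLTTGIV"}
tags1 = representative_kmers(build1, k=3)

# Toy "build 2": same genes plus a newly annotated, unrelated gene.
build2 = dict(build1, geneC="PQRST")
tags2 = representative_kmers(build2, k=3)

print(tags1)
print(tags2)
```

In this toy case the tags of `geneA` and `geneB` are the same in both builds, because the new gene shares no 3-mers with them. That is the kind of stability the description attributes to content-derived identifiers, in contrast to serial numbers that shift whenever a gene is inserted.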