Creating reference gene annotation for the mouse C57BL6/J genome assembly

Bibliographic Details
Main Authors: Mudge, Jonathan M., Harrow, Jennifer
Format: Online Article Text
Language: English
Published: Springer US 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4602055/
https://www.ncbi.nlm.nih.gov/pubmed/26187010
http://dx.doi.org/10.1007/s00335-015-9583-x
Description
Summary: Annotation on the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the use of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00335-015-9583-x) contains supplementary material, which is available to authorized users.