
Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms

Next-generation sequencing technologies and the availability of an increasing number of mammalian and other genomes allow gene expression studies, particularly RNA sequencing, in many non-model organisms. However, incomplete genome annotation and assignments of genes to functional annotation databas...


Bibliographic Details
Main authors: Bick, Jochen T, Zeng, Shuqin, Robinson, Mark D, Ulbrich, Susanne E, Bauersachs, Stefan
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6661403/
https://www.ncbi.nlm.nih.gov/pubmed/31353404
http://dx.doi.org/10.1093/database/baz086
author Bick, Jochen T
Zeng, Shuqin
Robinson, Mark D
Ulbrich, Susanne E
Bauersachs, Stefan
collection PubMed
description Next-generation sequencing technologies and the availability of an increasing number of mammalian and other genomes allow gene expression studies, particularly RNA sequencing, in many non-model organisms. However, incomplete genome annotation and incomplete assignment of genes to functional annotation databases can lead to a substantial loss of information in downstream data analysis. To overcome this, we developed the Mammalian Annotation Database tool (MAdb, https://madb.ethz.ch) to conveniently provide homologous gene information for selected mammalian species. The assignment between species is performed in three steps: (i) matching official gene symbols, (ii) using ortholog information contained in Ensembl Compara and (iii) pairwise BLAST comparisons of all transcripts. In addition, we developed a new tool (AnnOverlappeR) for the reliable mapping between National Center for Biotechnology Information (NCBI) and Ensembl gene IDs. Gene lists translated to gene IDs of a well-annotated species such as human can be used for improved functional annotation with relevant tools based on Gene Ontology and molecular pathway information. We tested MAdb on a published RNA-seq data set for the pig and showed clearly improved overrepresentation analysis results based on the assigned human homologous gene identifiers. Using MAdb yielded similar lists of human homologous genes and similar functional annotation results, regardless of whether the starting gene IDs came from NCBI or Ensembl. MAdb is accessible via a web interface and a Galaxy application.
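The three assignment steps named in the description can be pictured as an ordered lookup. The sketch below is only an illustration and not MAdb's implementation: it assumes the steps act as a fallback cascade (a later step is consulted only when earlier ones give no hit), which the description does not state, and every gene ID, table, and function name in it is hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A lookup maps a source-species gene ID to a human gene ID, or None if no hit.
Lookup = Callable[[str], Optional[str]]


def assign_homolog(
    gene_id: str, tiers: List[Tuple[str, Lookup]]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (human_id, tier_name) from the first tier that yields a hit.

    Assumption: tiers are tried in order and the first hit wins.
    """
    for name, lookup in tiers:
        hit = lookup(gene_id)
        if hit is not None:
            return hit, name
    return None, None


# Toy, made-up tables standing in for the three evidence sources.
symbol_matches: Dict[str, str] = {"pig_gene_1": "human_gene_A"}
compara_orthologs: Dict[str, str] = {
    "pig_gene_1": "human_gene_X",  # ignored: the symbol tier already matched
    "pig_gene_2": "human_gene_B",
}
blast_best_hits: Dict[str, str] = {"pig_gene_3": "human_gene_C"}

tiers: List[Tuple[str, Lookup]] = [
    ("symbol", symbol_matches.get),      # (i) official gene symbol match
    ("compara", compara_orthologs.get),  # (ii) Ensembl Compara ortholog
    ("blast", blast_best_hits.get),      # (iii) pairwise BLAST hit
]

results = {
    g: assign_homolog(g, tiers)
    for g in ["pig_gene_1", "pig_gene_2", "pig_gene_3", "pig_gene_4"]
}

for gene, (human_id, source) in results.items():
    print(gene, human_id, source)
```

Recording which tier produced each assignment keeps the evidence level visible, so that lower-confidence hits can be filtered before functional annotation.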
format Online
Article
Text
id pubmed-6661403
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6661403 2019-08-02 Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms Bick, Jochen T Zeng, Shuqin Robinson, Mark D Ulbrich, Susanne E Bauersachs, Stefan Database (Oxford) Database Tool
Oxford University Press 2019-07-26 /pmc/articles/PMC6661403/ /pubmed/31353404 http://dx.doi.org/10.1093/database/baz086 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms
topic Database Tool