Preserving sequence annotations across reference sequences

BACKGROUND: Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone...

Bibliographic Details
Main authors: Tatum, Zuotian; Roos, Marco; Gibson, Andrew P; Taschner, Peter EM; Thompson, Mark; Schultes, Erik A; Laros, Jeroen FJ
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4108922/
https://www.ncbi.nlm.nih.gov/pubmed/25093075
http://dx.doi.org/10.1186/2041-1480-5-S1-S6
author Tatum, Zuotian
Roos, Marco
Gibson, Andrew P
Taschner, Peter EM
Thompson, Mark
Schultes, Erik A
Laros, Jeroen FJ
collection PubMed
description BACKGROUND: Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. RESULTS: As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. CONCLUSIONS: We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO.
format Online
Article
Text
id pubmed-4108922
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4108922 2014-08-04 Preserving sequence annotations across reference sequences J Biomed Semantics Proceedings BioMed Central 2014-06-03 /pmc/articles/PMC4108922/ /pubmed/25093075 http://dx.doi.org/10.1186/2041-1480-5-S1-S6 Text en Copyright © 2014 Tatum et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Preserving sequence annotations across reference sequences
topic Proceedings