Preserving sequence annotations across reference sequences
BACKGROUND: Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone...
Main authors: | Tatum, Zuotian; Roos, Marco; Gibson, Andrew P; Taschner, Peter EM; Thompson, Mark; Schultes, Erik A; Laros, Jeroen FJ |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2014 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4108922/ https://www.ncbi.nlm.nih.gov/pubmed/25093075 http://dx.doi.org/10.1186/2041-1480-5-S1-S6 |
_version_ | 1782327810727084032 |
---|---|
author | Tatum, Zuotian Roos, Marco Gibson, Andrew P Taschner, Peter EM Thompson, Mark Schultes, Erik A Laros, Jeroen FJ |
author_facet | Tatum, Zuotian Roos, Marco Gibson, Andrew P Taschner, Peter EM Thompson, Mark Schultes, Erik A Laros, Jeroen FJ |
author_sort | Tatum, Zuotian |
collection | PubMed |
description | BACKGROUND: Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. RESULTS: As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. CONCLUSIONS: We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO. |
format | Online Article Text |
id | pubmed-4108922 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4108922 2014-08-04 Preserving sequence annotations across reference sequences Tatum, Zuotian Roos, Marco Gibson, Andrew P Taschner, Peter EM Thompson, Mark Schultes, Erik A Laros, Jeroen FJ J Biomed Semantics Proceedings BACKGROUND: Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. RESULTS: As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. CONCLUSIONS: We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO. BioMed Central 2014-06-03 /pmc/articles/PMC4108922/ /pubmed/25093075 http://dx.doi.org/10.1186/2041-1480-5-S1-S6 Text en Copyright © 2014 Tatum et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Tatum, Zuotian Roos, Marco Gibson, Andrew P Taschner, Peter EM Thompson, Mark Schultes, Erik A Laros, Jeroen FJ Preserving sequence annotations across reference sequences |
title | Preserving sequence annotations across reference sequences |
title_full | Preserving sequence annotations across reference sequences |
title_fullStr | Preserving sequence annotations across reference sequences |
title_full_unstemmed | Preserving sequence annotations across reference sequences |
title_short | Preserving sequence annotations across reference sequences |
title_sort | preserving sequence annotations across reference sequences |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4108922/ https://www.ncbi.nlm.nih.gov/pubmed/25093075 http://dx.doi.org/10.1186/2041-1480-5-S1-S6 |
work_keys_str_mv | AT tatumzuotian preservingsequenceannotationsacrossreferencesequences AT roosmarco preservingsequenceannotationsacrossreferencesequences AT gibsonandrewp preservingsequenceannotationsacrossreferencesequences AT taschnerpeterem preservingsequenceannotationsacrossreferencesequences AT thompsonmark preservingsequenceannotationsacrossreferencesequences AT schulteserika preservingsequenceannotationsacrossreferencesequences AT larosjeroenfj preservingsequenceannotationsacrossreferencesequences |