FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation

BACKGROUND: Nucleotide and protein sequence feature annotations are essential to understanding biology at the genomic, transcriptomic, and proteomic levels. For querying biological annotations with Semantic Web technologies, there was no standard that described this potentially complex location information...

Bibliographic Details
Main Authors: Bolleman, Jerven T., Mungall, Christopher J., Strozzi, Francesco, Baran, Joachim, Dumontier, Michel, Bonnal, Raoul J. P., Buels, Robert, Hoehndorf, Robert, Fujisawa, Takatomo, Katayama, Toshiaki, Cock, Peter J. A.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4907002/
https://www.ncbi.nlm.nih.gov/pubmed/27296299
http://dx.doi.org/10.1186/s13326-016-0067-z
_version_ 1782437500875177984
author Bolleman, Jerven T.
Mungall, Christopher J.
Strozzi, Francesco
Baran, Joachim
Dumontier, Michel
Bonnal, Raoul J. P.
Buels, Robert
Hoehndorf, Robert
Fujisawa, Takatomo
Katayama, Toshiaki
Cock, Peter J. A.
author_sort Bolleman, Jerven T.
collection PubMed
description BACKGROUND: Nucleotide and protein sequence feature annotations are essential to understanding biology at the genomic, transcriptomic, and proteomic levels. For querying biological annotations with Semantic Web technologies, there was no standard that described this potentially complex location information as subject-predicate-object triples. DESCRIPTION: We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using one data format, independent of file formats, to represent sequence positions allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. CONCLUSIONS: Our ontology allows users to uniformly describe, and potentially merge, sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federated SPARQL queries against public SPARQL endpoints and/or local private triple stores.
format Online
Article
Text
id pubmed-4907002
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4907002 2016-06-15 J Biomed Semantics Research BioMed Central 2016-06-13 /pmc/articles/PMC4907002/ /pubmed/27296299 http://dx.doi.org/10.1186/s13326-016-0067-z Text en © Bolleman et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4907002/
https://www.ncbi.nlm.nih.gov/pubmed/27296299
http://dx.doi.org/10.1186/s13326-016-0067-z