Improving the Sequence Ontology terminology for genomic variant annotation
BACKGROUND: The Genome Variant Format (GVF) uses the Sequence Ontology (SO) to enable detailed annotation of sequence variation. The annotation includes SO terms for the type of sequence alteration, the genomic features that are changed and the effect of the alteration. The SO maintains and updates...
Main authors: | Cunningham, Fiona; Moore, Barry; Ruiz-Schultz, Nicole; Ritchie, Graham RS; Eilbeck, Karen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4520272/ https://www.ncbi.nlm.nih.gov/pubmed/26229585 http://dx.doi.org/10.1186/s13326-015-0030-4 |
_version_ | 1782383638972727296 |
---|---|
author | Cunningham, Fiona Moore, Barry Ruiz-Schultz, Nicole Ritchie, Graham RS Eilbeck, Karen |
author_facet | Cunningham, Fiona Moore, Barry Ruiz-Schultz, Nicole Ritchie, Graham RS Eilbeck, Karen |
author_sort | Cunningham, Fiona |
collection | PubMed |
description | BACKGROUND: The Genome Variant Format (GVF) uses the Sequence Ontology (SO) to enable detailed annotation of sequence variation. The annotation includes SO terms for the type of sequence alteration, the genomic features that are changed and the effect of the alteration. The SO maintains and updates the specification and provides the underlying ontological structure. METHODS: A requirements analysis was undertaken to gather terms missing in the SO release at the time, but needed to adequately describe the effects of sequence alteration on a set of variant genomic annotations. We have extended and remodeled the SO to include and define all terms that describe the effect of variation upon reference genomic features in the Ensembl variation databases. RESULTS: The new terminology was used to annotate the human reference genome with a set of variants from both COSMIC and dbSNP. A GVF file containing 170,853 sequence alterations was generated using the SO terminology to annotate the kinds of alteration, the effect of the alteration and the reference feature changed. There are four kinds of alteration and 24 kinds of effect seen in this dataset. (Ensembl Variation annotates 34 different SO consequence terms: http://www.ensembl.org/info/docs/variation/predicted_data.html). CONCLUSIONS: We explain the updates to the Sequence Ontology to describe the effect of variation on existing reference features. We have provided a set of annotations using this terminology, and the well-defined GVF specification. We have also provided a provisional exploration of this large annotation dataset. |
format | Online Article Text |
id | pubmed-4520272 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4520272 2015-07-31 Improving the Sequence Ontology terminology for genomic variant annotation Cunningham, Fiona Moore, Barry Ruiz-Schultz, Nicole Ritchie, Graham RS Eilbeck, Karen J Biomed Semantics Short Report BACKGROUND: The Genome Variant Format (GVF) uses the Sequence Ontology (SO) to enable detailed annotation of sequence variation. The annotation includes SO terms for the type of sequence alteration, the genomic features that are changed and the effect of the alteration. The SO maintains and updates the specification and provides the underlying ontological structure. METHODS: A requirements analysis was undertaken to gather terms missing in the SO release at the time, but needed to adequately describe the effects of sequence alteration on a set of variant genomic annotations. We have extended and remodeled the SO to include and define all terms that describe the effect of variation upon reference genomic features in the Ensembl variation databases. RESULTS: The new terminology was used to annotate the human reference genome with a set of variants from both COSMIC and dbSNP. A GVF file containing 170,853 sequence alterations was generated using the SO terminology to annotate the kinds of alteration, the effect of the alteration and the reference feature changed. There are four kinds of alteration and 24 kinds of effect seen in this dataset. (Ensembl Variation annotates 34 different SO consequence terms: http://www.ensembl.org/info/docs/variation/predicted_data.html). CONCLUSIONS: We explain the updates to the Sequence Ontology to describe the effect of variation on existing reference features. We have provided a set of annotations using this terminology, and the well-defined GVF specification. We have also provided a provisional exploration of this large annotation dataset. BioMed Central 2015-07-31 /pmc/articles/PMC4520272/ /pubmed/26229585 http://dx.doi.org/10.1186/s13326-015-0030-4 Text en © Cunningham et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Short Report Cunningham, Fiona Moore, Barry Ruiz-Schultz, Nicole Ritchie, Graham RS Eilbeck, Karen Improving the Sequence Ontology terminology for genomic variant annotation |
title | Improving the Sequence Ontology terminology for genomic variant annotation |
title_full | Improving the Sequence Ontology terminology for genomic variant annotation |
title_fullStr | Improving the Sequence Ontology terminology for genomic variant annotation |
title_full_unstemmed | Improving the Sequence Ontology terminology for genomic variant annotation |
title_short | Improving the Sequence Ontology terminology for genomic variant annotation |
title_sort | improving the sequence ontology terminology for genomic variant annotation |
topic | Short Report |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4520272/ https://www.ncbi.nlm.nih.gov/pubmed/26229585 http://dx.doi.org/10.1186/s13326-015-0030-4 |
work_keys_str_mv | AT cunninghamfiona improvingthesequenceontologyterminologyforgenomicvariantannotation AT moorebarry improvingthesequenceontologyterminologyforgenomicvariantannotation AT ruizschultznicole improvingthesequenceontologyterminologyforgenomicvariantannotation AT ritchiegrahamrs improvingthesequenceontologyterminologyforgenomicvariantannotation AT eilbeckkaren improvingthesequenceontologyterminologyforgenomicvariantannotation |