Semantically enabling a genome-wide association study database

BACKGROUND: The amount of data generated from genome-wide association studies (GWAS) has grown rapidly, but considerations for GWAS phenotype data reuse and interchange have not kept pace. This impacts on the work of GWAS Central – a free and open access resource for the advanced querying and compar...

Bibliographic Details
Main Authors: Beck, Tim; Free, Robert C; Thorisson, Gudmundur A; Brookes, Anthony J
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3579732/
https://www.ncbi.nlm.nih.gov/pubmed/23244533
http://dx.doi.org/10.1186/2041-1480-3-9
author Beck, Tim
Free, Robert C
Thorisson, Gudmundur A
Brookes, Anthony J
collection PubMed
description BACKGROUND: The amount of data generated from genome-wide association studies (GWAS) has grown rapidly, but considerations for GWAS phenotype data reuse and interchange have not kept pace. This impacts on the work of GWAS Central – a free and open access resource for the advanced querying and comparison of summary-level genetic association data. The benefits of employing ontologies for standardising and structuring data are widely accepted. The complex spectrum of observed human phenotypes (and traits), and the requirement for cross-species phenotype comparisons, calls for reflection on the most appropriate solution for the organisation of human phenotype data. The Semantic Web provides standards for the possibility of further integration of GWAS data and the ability to contribute to the web of Linked Data.
RESULTS: A pragmatic consideration when applying phenotype ontologies to GWAS data is the ability to retrieve all data, at the most granular level possible, from querying a single ontology graph. We found the Medical Subject Headings (MeSH) terminology suitable for describing all traits (diseases and medical signs and symptoms) at various levels of granularity and the Human Phenotype Ontology (HPO) most suitable for describing phenotypic abnormalities (medical signs and symptoms) at the most granular level. Diseases within MeSH are mapped to HPO to infer the phenotypic abnormalities associated with diseases. Building on the rich semantic phenotype annotation layer, we are able to make cross-species phenotype comparisons and publish a core subset of GWAS data as RDF nanopublications.
CONCLUSIONS: We present a methodology for applying phenotype annotations to a comprehensive genome-wide association dataset and for ensuring compatibility with the Semantic Web. The annotations are used to assist with cross-species genotype and phenotype comparisons. However, further processing and deconstructions of terms may be required to facilitate automatic phenotype comparisons. The provision of GWAS nanopublications enables a new dimension for exploring GWAS data, by way of intrinsic links to related data resources within the Linked Data web. The value of such annotation and integration will grow as more biomedical resources adopt the standards of the Semantic Web.
format Online
Article
Text
id pubmed-3579732
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3579732 2013-02-23 Semantically enabling a genome-wide association study database Beck, Tim Free, Robert C Thorisson, Gudmundur A Brookes, Anthony J J Biomed Semantics Research BioMed Central 2012-12-17 /pmc/articles/PMC3579732/ /pubmed/23244533 http://dx.doi.org/10.1186/2041-1480-3-9 Text en Copyright ©2012 Beck et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Semantically enabling a genome-wide association study database
topic Research