
Wikidata as a semantic framework for the Gene Wiki initiative

Open biological data are distributed over many resources making them challenging to integrate, to update and to disseminate quickly. Wikidata is a growing, open community database which can serve this purpose and also provides tight integration with Wikipedia. In order to improve the state of biolog...


Bibliographic Details
Main authors: Burgstaller-Muehlbacher, Sebastian, Waagmeester, Andra, Mitraka, Elvira, Turner, Julia, Putman, Tim, Leong, Justin, Naik, Chinmay, Pavlidis, Paul, Schriml, Lynn, Good, Benjamin M, Su, Andrew I
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795929/
https://www.ncbi.nlm.nih.gov/pubmed/26989148
http://dx.doi.org/10.1093/database/baw015
author Burgstaller-Muehlbacher, Sebastian
Waagmeester, Andra
Mitraka, Elvira
Turner, Julia
Putman, Tim
Leong, Justin
Naik, Chinmay
Pavlidis, Paul
Schriml, Lynn
Good, Benjamin M
Su, Andrew I
author_sort Burgstaller-Muehlbacher, Sebastian
collection PubMed
description Open biological data are distributed over many resources making them challenging to integrate, to update and to disseminate quickly. Wikidata is a growing, open community database which can serve this purpose and also provides tight integration with Wikipedia. In order to improve the state of biological data, facilitate data management and dissemination, we imported all human and mouse genes, and all human and mouse proteins into Wikidata. In total, 59 721 human genes and 73 355 mouse genes have been imported from NCBI and 27 306 human proteins and 16 728 mouse proteins have been imported from the Swissprot subset of UniProt. As Wikidata is open and can be edited by anybody, our corpus of imported data serves as the starting point for integration of further data by scientists, the Wikidata community and citizen scientists alike. The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified. Although Gene Wiki pages are currently only on the English language version of Wikipedia, the multilingual nature of Wikidata allows for usage of the data we imported in all 280 different language Wikipedias. Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists. In summary, we created a fully open and extensible data resource for human and mouse molecular biology and biochemistry data. This resource enriches all the Wikipedias with structured information and serves as a new linking hub for the biological semantic web. Database URL: https://www.wikidata.org/
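The SPARQL endpoint and export formats mentioned in the description can be illustrated with a minimal sketch. It only builds request URLs and sends nothing over the network. The endpoint address, the `Special:EntityData` export path, the property IDs (P351 for Entrez Gene ID, P703 for "found in taxon") and the item ID Q15978631 for Homo sapiens are assumptions drawn from general Wikidata usage, not from this record; the function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed service addresses (not given in the record above).
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData/"


def human_gene_query(limit=5):
    """Return SPARQL text selecting items that carry an Entrez Gene ID
    and are stated to be found in Homo sapiens."""
    return (
        "SELECT ?gene ?entrez WHERE { "
        "?gene wdt:P351 ?entrez ; "      # P351: Entrez Gene ID (assumed)
        "wdt:P703 wd:Q15978631 . "       # P703: found in taxon; Q15978631: Homo sapiens (assumed)
        "} LIMIT " + str(int(limit))
    )


def sparql_url(query, fmt="json"):
    """Build a GET URL that asks the endpoint for results in the given format."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": fmt})


def entity_export_url(qid, fmt="json"):
    """Build the export URL for a single item, e.g. in JSON or Turtle."""
    if fmt not in ("json", "ttl"):
        raise ValueError("this sketch only covers the json and ttl exports")
    return ENTITY_DATA + qid + "." + fmt


if __name__ == "__main__":
    print(sparql_url(human_gene_query(3)))
    print(entity_export_url("Q42"))
```

Fetching either URL with any HTTP client should return machine-readable data. The query is kept small with `LIMIT` because the full gene set described above is tens of thousands of items.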
format Online
Article
Text
id pubmed-4795929
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4795929 2016-03-21 Wikidata as a semantic framework for the Gene Wiki initiative Burgstaller-Muehlbacher, Sebastian Waagmeester, Andra Mitraka, Elvira Turner, Julia Putman, Tim Leong, Justin Naik, Chinmay Pavlidis, Paul Schriml, Lynn Good, Benjamin M Su, Andrew I Database (Oxford) Original Article Oxford University Press 2016-03-17 /pmc/articles/PMC4795929/ /pubmed/26989148 http://dx.doi.org/10.1093/database/baw015 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Wikidata as a semantic framework for the Gene Wiki initiative
topic Original Article