A resource to explore the discovery of rare diseases and their causative genes

Here, we describe a dataset with information about monogenic, rare diseases with a known genetic background, supplemented with manually extracted provenance for the disease itself and the discovery of the underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked...

Bibliographic Details
Main authors: Ehrhart, Friederike; Willighagen, Egon L.; Kutmon, Martina; van Hoften, Max; Curfs, Leopold M. G.; Evelo, Chris T.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8096966/
https://www.ncbi.nlm.nih.gov/pubmed/33947870
http://dx.doi.org/10.1038/s41597-021-00905-y
_version_ 1783688253354803200
author Ehrhart, Friederike
Willighagen, Egon L.
Kutmon, Martina
van Hoften, Max
Curfs, Leopold M. G.
Evelo, Chris T.
author_sort Ehrhart, Friederike
collection PubMed
description Here, we describe a dataset with information about monogenic, rare diseases with a known genetic background, supplemented with manually extracted provenance for the disease itself and the discovery of the underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked them to 3163 causative genes, annotated with OMIM and Ensembl identifiers and HGNC symbols. The PubMed identifiers of the publications that first described the rare diseases, and of the publications that identified the genes causing them, were added using information from OMIM, PubMed, Wikipedia, whonamedit.com, and Google Scholar. The data are available under a CC0 license as a spreadsheet and as RDF in a semantic model modified from DisGeNET, and were added to Wikidata. The dataset relies on publicly available data and publications with a PubMed identifier; our effort to make the data interoperable and linked is what now allows them to be analysed. Our analysis revealed the timeline of rare disease and causative gene discovery and linked it to developments in methods.
format Online
Article
Text
id pubmed-8096966
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8096966 2021-05-05 A resource to explore the discovery of rare diseases and their causative genes. Ehrhart, Friederike; Willighagen, Egon L.; Kutmon, Martina; van Hoften, Max; Curfs, Leopold M. G.; Evelo, Chris T. Sci Data. Data Descriptor. Nature Publishing Group UK 2021-05-04. /pmc/articles/PMC8096966/ /pubmed/33947870 http://dx.doi.org/10.1038/s41597-021-00905-y Text en
© The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the metadata files associated with this article.
title A resource to explore the discovery of rare diseases and their causative genes
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8096966/
https://www.ncbi.nlm.nih.gov/pubmed/33947870
http://dx.doi.org/10.1038/s41597-021-00905-y