A resource to explore the discovery of rare diseases and their causative genes
Here, we describe a dataset with information about monogenic, rare diseases with a known genetic background, supplemented with manually extracted provenance for the disease itself and the discovery of the underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked...
Main authors: | Ehrhart, Friederike; Willighagen, Egon L.; Kutmon, Martina; van Hoften, Max; Curfs, Leopold M. G.; Evelo, Chris T. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | Data Descriptor |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8096966/ https://www.ncbi.nlm.nih.gov/pubmed/33947870 http://dx.doi.org/10.1038/s41597-021-00905-y |
_version_ | 1783688253354803200 |
---|---|
author | Ehrhart, Friederike; Willighagen, Egon L.; Kutmon, Martina; van Hoften, Max; Curfs, Leopold M. G.; Evelo, Chris T. |
author_facet | Ehrhart, Friederike; Willighagen, Egon L.; Kutmon, Martina; van Hoften, Max; Curfs, Leopold M. G.; Evelo, Chris T. |
author_sort | Ehrhart, Friederike |
collection | PubMed |
description | Here, we describe a dataset with information about monogenic, rare diseases with a known genetic background, supplemented with manually extracted provenance for the disease itself and the discovery of the underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked them to 3163 causative genes, annotated with OMIM and Ensembl identifiers and HGNC symbols. The PubMed identifiers of the scientific publications that first described the rare diseases, and of the publications that identified the genes causing them, were added using information from OMIM, PubMed, Wikipedia, whonamedit.com, and Google Scholar. The data are available under a CC0 license as a spreadsheet and as RDF in a semantic model modified from DisGeNET, and were added to Wikidata. This dataset relies on publicly available data and publications with a PubMed identifier, but by our effort to make the data interoperable and linked, we can now analyse these data. Our analysis reveals the timeline of rare disease and causative gene discovery and links it to developments in methods. |
format | Online Article Text |
id | pubmed-8096966 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8096966 2021-05-05 A resource to explore the discovery of rare diseases and their causative genes Ehrhart, Friederike Willighagen, Egon L. Kutmon, Martina van Hoften, Max Curfs, Leopold M. G. Evelo, Chris T. Sci Data Data Descriptor Here, we describe a dataset with information about monogenic, rare diseases with a known genetic background, supplemented with manually extracted provenance for the disease itself and the discovery of the underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked them to 3163 causative genes, annotated with OMIM and Ensembl identifiers and HGNC symbols. The PubMed identifiers of the scientific publications that first described the rare diseases, and of the publications that identified the genes causing them, were added using information from OMIM, PubMed, Wikipedia, whonamedit.com, and Google Scholar. The data are available under a CC0 license as a spreadsheet and as RDF in a semantic model modified from DisGeNET, and were added to Wikidata. This dataset relies on publicly available data and publications with a PubMed identifier, but by our effort to make the data interoperable and linked, we can now analyse these data. Our analysis reveals the timeline of rare disease and causative gene discovery and links it to developments in methods. Nature Publishing Group UK 2021-05-04 /pmc/articles/PMC8096966/ /pubmed/33947870 http://dx.doi.org/10.1038/s41597-021-00905-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the metadata files associated with this article. |
spellingShingle | Data Descriptor Ehrhart, Friederike Willighagen, Egon L. Kutmon, Martina van Hoften, Max Curfs, Leopold M. G. Evelo, Chris T. A resource to explore the discovery of rare diseases and their causative genes |
title | A resource to explore the discovery of rare diseases and their causative genes |
title_full | A resource to explore the discovery of rare diseases and their causative genes |
title_fullStr | A resource to explore the discovery of rare diseases and their causative genes |
title_full_unstemmed | A resource to explore the discovery of rare diseases and their causative genes |
title_short | A resource to explore the discovery of rare diseases and their causative genes |
title_sort | resource to explore the discovery of rare diseases and their causative genes |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8096966/ https://www.ncbi.nlm.nih.gov/pubmed/33947870 http://dx.doi.org/10.1038/s41597-021-00905-y |