BiobankUniverse: automatic matchmaking between datasets for biobank data discovery and integration
MOTIVATION: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. RESULTS: To overcome this, we developed a new matching algorithm that identifies pairs of relat...
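Main authors: | Pang, Chao; Kelpin, Fleur; van Enckevort, David; Eklund, Niina; Silander, Kaisa; Hendriksen, Dennis; de Haan, Mark; Jetten, Jonathan; de Boer, Tommy; Charbon, Bart; Holub, Petr; Hillege, Hans; Swertz, Morris A |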
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2017 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870622/ https://www.ncbi.nlm.nih.gov/pubmed/29036577 http://dx.doi.org/10.1093/bioinformatics/btx478 |
_version_ | 1783309520300146688 |
---|---|
author | Pang, Chao Kelpin, Fleur van Enckevort, David Eklund, Niina Silander, Kaisa Hendriksen, Dennis de Haan, Mark Jetten, Jonathan de Boer, Tommy Charbon, Bart Holub, Petr Hillege, Hans Swertz, Morris A |
author_sort | Pang, Chao |
collection | PubMed |
description | MOTIVATION: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. RESULTS: To overcome this, we developed a new matching algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service for biobanks and researchers. Biobankers upload their data elements and researchers upload their desired study variables; BiobankUniverse then automatically shortlists matching attributes between them. Users can quickly explore matching potential and search for biobanks or data elements matching their research. They can also curate matches and define personalized data-universes. AVAILABILITY AND IMPLEMENTATION: BiobankUniverse is available at http://biobankuniverse.com or can be downloaded as part of the open-source MOLGENIS suite at http://github.com/molgenis/molgenis. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5870622 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-58706222018-04-05 BiobankUniverse: automatic matchmaking between datasets for biobank data discovery and integration Pang, Chao Kelpin, Fleur van Enckevort, David Eklund, Niina Silander, Kaisa Hendriksen, Dennis de Haan, Mark Jetten, Jonathan de Boer, Tommy Charbon, Bart Holub, Petr Hillege, Hans Swertz, Morris A Bioinformatics Original Papers MOTIVATION: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. RESULTS: To overcome this, we developed a new matching algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service for biobanks and researchers. Biobankers upload their data elements and researchers their desired study variables, BiobankUniverse automatically shortlists matching attributes between them. Users can quickly explore matching potential and search for biobanks/data elements matching their research. They can also curate matches and define personalized data-universes. AVAILABILITY AND IMPLEMENTATION: BiobankUniverse is available at http://biobankuniverse.com or can be downloaded as part of the open source MOLGENIS suite at http://github.com/molgenis/molgenis. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2017-11-15 2017-08-02 /pmc/articles/PMC5870622/ /pubmed/29036577 http://dx.doi.org/10.1093/bioinformatics/btx478 Text en © The Author 2017. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Original Papers Pang, Chao Kelpin, Fleur van Enckevort, David Eklund, Niina Silander, Kaisa Hendriksen, Dennis de Haan, Mark Jetten, Jonathan de Boer, Tommy Charbon, Bart Holub, Petr Hillege, Hans Swertz, Morris A BiobankUniverse: automatic matchmaking between datasets for biobank data discovery and integration |
title | BiobankUniverse: automatic matchmaking between datasets for biobank data discovery and integration |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870622/ https://www.ncbi.nlm.nih.gov/pubmed/29036577 http://dx.doi.org/10.1093/bioinformatics/btx478 |
work_keys_str_mv | AT pangchao biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT kelpinfleur biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT vanenckevortdavid biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT eklundniina biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT silanderkaisa biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT hendriksendennis biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT dehaanmark biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT jettenjonathan biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT deboertommy biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT charbonbart biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT holubpetr biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT hillegehans biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration AT swertzmorrisa biobankuniverseautomaticmatchmakingbetweendatasetsforbiobankdatadiscoveryandintegration |
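
The description in this record names lexical comparison and semantic query expansion as ingredients of the matching algorithm. The sketch below illustrates that general idea only; it is not the authors' implementation and does not use the MOLGENIS code or the Unified Medical Language System. The `SYNONYMS` table, the stopword list, the Jaccard score, the `0.2` threshold and all function names are assumptions introduced for illustration.

```python
import re

# Hypothetical synonym table standing in for ontology-based query expansion.
# The service described above relies on UMLS; nothing here reproduces it.
SYNONYMS = {
    "hypertension": {"high", "blood", "pressure"},
    "smoking": {"tobacco", "cigarette"},
}

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"of", "the", "a", "an", "in", "at", "do", "you", "have"}


def tokenize(label):
    """Lower-case a label and return its set of non-stopword tokens."""
    words = re.findall(r"[a-z0-9]+", label.lower())
    return {w for w in words if w not in STOPWORDS}


def expand(tokens):
    """Add the synonym tokens of every token that has an entry in SYNONYMS."""
    expanded = set(tokens)
    for token in tokens:
        expanded |= SYNONYMS.get(token, set())
    return expanded


def similarity(variable, element):
    """Jaccard overlap between expanded variable tokens and element tokens."""
    query = expand(tokenize(variable))
    target = tokenize(element)
    if not query or not target:
        return 0.0
    return len(query & target) / len(query | target)


def shortlist(variable, elements, threshold=0.2):
    """Return (element, score) pairs scoring at least threshold, best first."""
    scored = [(element, similarity(variable, element)) for element in elements]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    biobank_elements = [
        "High blood pressure diagnosis",
        "Cigarette use per day",
        "Body height",
    ]
    for element, score in shortlist("Hypertension", biobank_elements):
        print(f"{score:.2f}  {element}")
```

Without the expansion step, "Hypertension" and "High blood pressure diagnosis" share no tokens, so a purely lexical comparison would miss the pair. That gap is the reason a matcher of this kind combines both steps.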