
BiobankUniverse: automatic matchmaking between datasets for biobank data discovery and integration

MOTIVATION: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. RESULTS: To overcome this, we developed a new matching algorithm that identifies pairs of relat...


Bibliographic Details
Main authors: Pang, Chao; Kelpin, Fleur; van Enckevort, David; Eklund, Niina; Silander, Kaisa; Hendriksen, Dennis; de Haan, Mark; Jetten, Jonathan; de Boer, Tommy; Charbon, Bart; Holub, Petr; Hillege, Hans; Swertz, Morris A
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870622/
https://www.ncbi.nlm.nih.gov/pubmed/29036577
http://dx.doi.org/10.1093/bioinformatics/btx478
author Pang, Chao
Kelpin, Fleur
van Enckevort, David
Eklund, Niina
Silander, Kaisa
Hendriksen, Dennis
de Haan, Mark
Jetten, Jonathan
de Boer, Tommy
Charbon, Bart
Holub, Petr
Hillege, Hans
Swertz, Morris A
author_sort Pang, Chao
collection PubMed
description MOTIVATION: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. RESULTS: To overcome this, we developed a new matching algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service for biobanks and researchers. Biobankers upload their data elements and researchers their desired study variables, BiobankUniverse automatically shortlists matching attributes between them. Users can quickly explore matching potential and search for biobanks/data elements matching their research. They can also curate matches and define personalized data-universes. AVAILABILITY AND IMPLEMENTATION: BiobankUniverse is available at http://biobankuniverse.com or can be downloaded as part of the open source MOLGENIS suite at http://github.com/molgenis/molgenis. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
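The abstract names three ingredients (lexical comparison, ontology tagging, and semantic query expansion) without giving details. The following is only a minimal sketch of how lexical comparison and synonym-based query expansion could be combined to shortlist data elements; it is not the BiobankUniverse implementation. The `SYNONYMS` table is a hypothetical stand-in for an ontology such as UMLS, and the Jaccard score, the function names, and the threshold are all assumptions made for illustration.

```python
import re

# Hypothetical synonym table standing in for an ontology lookup (assumption,
# not real UMLS content).
SYNONYMS = {
    "gender": {"sex"},
    "weight": {"mass"},
    "height": {"stature"},
}


def tokenize(label):
    """Lowercase a label and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))


def expand(tokens):
    """Add known synonyms of each token (a toy form of query expansion)."""
    expanded = set(tokens)
    for token in tokens:
        expanded |= SYNONYMS.get(token, set())
    return expanded


def jaccard(a, b):
    """Lexical similarity: shared tokens divided by all distinct tokens."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def shortlist(variable, elements, threshold=0.3):
    """Return (score, element) pairs scoring at or above threshold, best first."""
    query = expand(tokenize(variable))
    scored = [(jaccard(query, tokenize(element)), element) for element in elements]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


if __name__ == "__main__":
    data_elements = ["Sex of the participant", "Body mass in kg", "Smoking status"]
    for score, element in shortlist("Gender of participant", data_elements):
        print(f"{score:.2f}  {element}")
```

In this toy setup, expanding "gender" to include "sex" raises the overlap between the research variable and the first data element, which is the general effect query expansion is meant to have; a real system would draw synonyms from the ontology rather than a hand-written table.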
format Online
Article
Text
id pubmed-5870622
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5870622 2018-04-05 Bioinformatics Original Papers Oxford University Press 2017-11-15 2017-08-02 /pmc/articles/PMC5870622/ /pubmed/29036577 http://dx.doi.org/10.1093/bioinformatics/btx478 Text en © The Author 2017. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title BiobankUniverse: automatic matchmaking between datasets for biobank data discovery and integration
topic Original Papers