A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data

This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from imp...

Bibliographic Details
Main authors: Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5148592/
https://www.ncbi.nlm.nih.gov/pubmed/27936191
http://dx.doi.org/10.1371/journal.pone.0168004
author Delussu, Giovanni
Lianas, Luca
Frexia, Francesca
Zanetti, Gianluigi
collection PubMed
description This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.
format Online
Article
Text
id pubmed-5148592
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5148592 2016-12-28 A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data. Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi. PLoS One. Research Article. This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes. Public Library of Science, 2016-12-09. /pmc/articles/PMC5148592/ /pubmed/27936191 http://dx.doi.org/10.1371/journal.pone.0168004 Text en © 2016 Delussu et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5148592/
https://www.ncbi.nlm.nih.gov/pubmed/27936191
http://dx.doi.org/10.1371/journal.pone.0168004