
Integrating terminologies into standard SQL: a new approach for research on routine data

BACKGROUND: Most electronic medical records still contain large amounts of free-text data. Semantic evaluation of such data requires the data to be encoded with sufficient classifications or transformed into a knowledge-based database. METHODS: We present an approach that allows databases accessible...


Bibliographic Details
Main authors: Sander, André; Wauer, Roland
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6480592/
https://www.ncbi.nlm.nih.gov/pubmed/31014403
http://dx.doi.org/10.1186/s13326-019-0199-z
_version_ 1783413601009139712
author Sander, André
Wauer, Roland
collection PubMed
description BACKGROUND: Most electronic medical records still contain large amounts of free-text data. Semantic evaluation of such data requires the data to be encoded with sufficient classifications or transformed into a knowledge-based database. METHODS: We present an approach that allows databases accessible via SQL (Structured Query Language) to be searched directly through semantic queries without the need for further transformations. To this end, we developed I) an extension to SQL named Ontology-SQL (O-SQL) that allows the use of semantic expressions, II) a framework that uses a standard terminology server to annotate database tables containing free text, and III) a parser that rewrites O-SQL to SQL so that such queries can be passed to the database server. RESULTS: I) We compared several semantic queries published to date and were able to reproduce them in a reduced, highly condensed form. II) The quality of the annotation process was measured against manual annotation, and we found a sensitivity of 97.62% and a specificity of 100.00%. III) Different semantic queries were analyzed and yielded F-scores between 0.91 and 0.98. CONCLUSIONS: We showed that systematic analysis of free-text-containing medical records is possible with standard tools. The seamless connection of ontologies and standard technologies from the database field represents an important constituent of unstructured data analysis. The developed technology can be readily applied to relationally organized data and supports the increasingly important field of translational research. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13326-019-0199-z) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6480592
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6480592 2019-05-01 Integrating terminologies into standard SQL: a new approach for research on routine data Sander, André Wauer, Roland J Biomed Semantics Research
BioMed Central 2019-04-24 /pmc/articles/PMC6480592/ /pubmed/31014403 http://dx.doi.org/10.1186/s13326-019-0199-z Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Integrating terminologies into standard SQL: a new approach for research on routine data
topic Research