Integrating terminologies into standard SQL: a new approach for research on routine data
BACKGROUND: Most electronic medical records still contain large amounts of free-text data. Semantic evaluation of such data requires the data to be encoded with sufficient classifications or transformed into a knowledge-based database. METHODS: We present an approach that allows databases accessible...
Main authors: | Sander, André; Wauer, Roland |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6480592/ https://www.ncbi.nlm.nih.gov/pubmed/31014403 http://dx.doi.org/10.1186/s13326-019-0199-z |
_version_ | 1783413601009139712 |
---|---|
author | Sander, André Wauer, Roland |
author_facet | Sander, André Wauer, Roland |
author_sort | Sander, André |
collection | PubMed |
description | BACKGROUND: Most electronic medical records still contain large amounts of free-text data. Semantic evaluation of such data requires the data to be encoded with sufficient classifications or transformed into a knowledge-based database. METHODS: We present an approach that allows databases accessible via SQL (Structured Query Language) to be searched directly through semantic queries without the need for further transformations. To this end, we developed I) an extension to SQL named Ontology-SQL (O-SQL) that allows the use of semantic expressions, II) a framework that uses a standard terminology server to annotate free-text-containing database tables and III) a parser that rewrites O-SQL to SQL, so that such queries can be passed to the database server. RESULTS: I) We compared several semantic queries published to date and were able to reproduce them in a reduced, highly condensed form. II) The quality of the annotation process was measured against manual annotation, and we found a sensitivity of 97.62% and a specificity of 100.00%. III) Different semantic queries were analyzed and yielded F-scores between 0.91 and 0.98. CONCLUSIONS: We showed that systematic analysis of free-text-containing medical records is possible with standard tools. The seamless connection of ontologies and standard technologies from the database field represents an important constituent of unstructured data analysis. The developed technology can be readily applied to relationally organized data and supports the increasingly important field of translational research. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13326-019-0199-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6480592 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6480592 2019-05-01 Integrating terminologies into standard SQL: a new approach for research on routine data Sander, André Wauer, Roland J Biomed Semantics Research BioMed Central 2019-04-24 /pmc/articles/PMC6480592/ /pubmed/31014403 http://dx.doi.org/10.1186/s13326-019-0199-z Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Sander, André Wauer, Roland Integrating terminologies into standard SQL: a new approach for research on routine data |
title | Integrating terminologies into standard SQL: a new approach for research on routine data |
title_full | Integrating terminologies into standard SQL: a new approach for research on routine data |
title_fullStr | Integrating terminologies into standard SQL: a new approach for research on routine data |
title_full_unstemmed | Integrating terminologies into standard SQL: a new approach for research on routine data |
title_short | Integrating terminologies into standard SQL: a new approach for research on routine data |
title_sort | integrating terminologies into standard sql: a new approach for research on routine data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6480592/ https://www.ncbi.nlm.nih.gov/pubmed/31014403 http://dx.doi.org/10.1186/s13326-019-0199-z |
work_keys_str_mv | AT sanderandre integratingterminologiesintostandardsqlanewapproachforresearchonroutinedata AT wauerroland integratingterminologiesintostandardsqlanewapproachforresearchonroutinedata |
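
The description field reports annotation quality as sensitivity and specificity and query quality as F-scores. As a reading aid, the following minimal Python sketch shows how these measures are commonly derived from a confusion matrix. It is not the authors' evaluation code: the `Confusion` class is a hypothetical helper, the counts are illustrative and are not the study's data, and the balanced F1 variant is an assumption, since the record does not say which F-score was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing automatic annotations with a manual reference."""
    tp: int  # concept annotated automatically and present in the manual reference
    fp: int  # concept annotated automatically but absent from the manual reference
    tn: int  # concept correctly not annotated
    fn: int  # concept in the manual reference that the annotator missed

    def sensitivity(self) -> float:
        # Share of reference annotations that were found (also called recall).
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of true negatives among all actual negatives.
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        # Share of automatic annotations that are correct.
        return self.tp / (self.tp + self.fp)

    def f1(self) -> float:
        # Harmonic mean of precision and recall (balanced F-score, assumed here).
        p, r = self.precision(), self.sensitivity()
        return 2 * p * r / (p + r)


# Illustrative counts only; NOT taken from the article.
example = Confusion(tp=8, fp=0, tn=10, fn=2)
print(f"sensitivity={example.sensitivity():.2f}")
print(f"specificity={example.specificity():.2f}")
print(f"F1={example.f1():.2f}")
```

With no false positives, specificity and precision are both 1.0 regardless of how many annotations were missed, which is why sensitivity and the F-score are the more informative figures in such a case.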