
Semi-automatic translation of medicine usage data (in Dutch, free-text) from Lifelines COVID-19 questionnaires to ATC codes

The mapping of human-entered data to codified data formats that can be analysed is a common problem across medical research and health care. To identify risk and protective factors for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) susceptibility and coronavirus disease 2019 (COVID-19)...


Bibliographic Details
Main Authors:	Kellmann, Alexander J, Lanting, Pauline, Franke, Lude, van Enckevort, Esther J, Swertz, Morris A
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2023
Subjects:	Original Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132814/ https://www.ncbi.nlm.nih.gov/pubmed/37114804 http://dx.doi.org/10.1093/database/baad019

author	Kellmann, Alexander J Lanting, Pauline Franke, Lude van Enckevort, Esther J Swertz, Morris A
collection	PubMed
description	The mapping of human-entered data to codified data formats that can be analysed is a common problem across medical research and health care. To identify risk and protective factors for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) susceptibility and coronavirus disease 2019 (COVID-19) severity, frequent questionnaires were sent out to participants of the Lifelines Cohort Study starting 30 March 2020. Because specific drugs were suspected COVID-19 risk factors, the questionnaires contained multiple-choice questions about commonly used drugs and open-ended questions to capture all other drugs used. To classify and evaluate the effects of those drugs and group participants taking similar drugs, the free-text answers needed to be translated into standard Anatomical Therapeutic Chemical (ATC) codes. This translation includes handling misspelt drug names, brand names, comments or multiple drugs listed in one line that would prevent a computer from finding these terms in a simple lookup table. In the past, the translation of free-text responses to ATC codes was time-intensive manual labour for experts. To reduce the amount of manual curation required, we developed a method for the semi-automated recoding of the free-text questionnaire responses into ATC codes suitable for further analysis. For this purpose, we built an ontology containing the Dutch drug names linked to their respective ATC code(s). In addition, we designed a semi-automated process that builds upon the Molgenis method SORTA to map the responses to ATC codes. This method can be applied to support the encoding of free-text responses to facilitate the evaluation, categorization and filtering of free-text responses. Our semi-automatic approach to coding of drugs using SORTA turned out to be more than two times faster than current manual approaches to performing this activity. Database URL https://doi.org/10.1093/database/baad019
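The lookup problem described in the abstract (misspelt names and several drugs on one line defeating a simple lookup table) can be illustrated with a minimal sketch. This is not SORTA and not the authors' pipeline: it uses Python's standard `difflib` as a stand-in matcher, and the tiny `ATC_LOOKUP` table, the `split_entries` and `suggest_atc` helpers, and the 0.8 similarity cutoff are all assumptions made for illustration only.

```python
import difflib
import re

# Hypothetical miniature lookup table: lowercase Dutch drug name -> ATC code.
# A real ontology would hold many more names, including brand names.
ATC_LOOKUP = {
    "paracetamol": "N02BE01",
    "ibuprofen": "M01AE01",
    "omeprazol": "A02BC01",
    "metformine": "A10BA02",
}


def split_entries(answer):
    """Split one free-text answer into candidate drug names.

    Splits on common separators and on the Dutch word "en" ("and").
    """
    parts = re.split(r"[,;/+]|\ben\b", answer.lower())
    return [p.strip() for p in parts if p.strip()]


def suggest_atc(answer, cutoff=0.8):
    """Return (term, matched_name, atc_code) for each candidate term.

    Terms without a sufficiently similar name get (term, None, None),
    so that they can be passed on to manual curation.
    """
    names = list(ATC_LOOKUP)
    results = []
    for term in split_entries(answer):
        match = difflib.get_close_matches(term, names, n=1, cutoff=cutoff)
        if match:
            results.append((term, match[0], ATC_LOOKUP[match[0]]))
        else:
            results.append((term, None, None))
    return results


if __name__ == "__main__":
    for row in suggest_atc("Paracetemol, ibuprofen en xyzxyz"):
        print(row)
```

In a semi-automated workflow of the kind the abstract describes, the unmatched or low-scoring terms would be the ones left for an expert to review; how SORTA scores and ranks candidates is not specified in this record.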
format	Online Article Text
id	pubmed-10132814
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-10132814 2023-04-27 Database (Oxford) Original Article Oxford University Press 2023-04-26 /pmc/articles/PMC10132814/ /pubmed/37114804 http://dx.doi.org/10.1093/database/baad019 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Semi-automatic translation of medicine usage data (in Dutch, free-text) from Lifelines COVID-19 questionnaires to ATC codes
topic	Original Article
