Automated data extraction—A feasible way to construct patient registers of primary care utilization

INTRODUCTION. Electronic medical records (EMRs) enable analysis of health care data by using data mining techniques to build research databases. Though the reliability of the data extraction process is crucial for the credibility of the final analysis, there are few published validations of this pro...

Bibliographic Details
Main Authors: Martinell, Mats, Stålhammar, Jan, Hallqvist, Johan
Format: Online Article Text
Language: English
Published: Informa Healthcare 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3282243/
https://www.ncbi.nlm.nih.gov/pubmed/22335391
http://dx.doi.org/10.3109/03009734.2011.653015
_version_ 1782224054521954304
author Martinell, Mats
Stålhammar, Jan
Hallqvist, Johan
author_facet Martinell, Mats
Stålhammar, Jan
Hallqvist, Johan
author_sort Martinell, Mats
collection PubMed
description INTRODUCTION. Electronic medical records (EMRs) enable analysis of health care data by using data mining techniques to build research databases. Though the reliability of the data extraction process is crucial for the credibility of the final analysis, there are few published validations of this process. In this paper we validate the performance of an automated data mining tool on EMRs in a primary care setting. METHODS. The Pygargus Customized eXtraction Program (CXP) was programmed to find and then extract data from patients meeting criteria for type 2 diabetes mellitus (T2DM) at one primary health care clinic (PHC). The ability of CXP to extract relevant cases was assessed by comparing them with cases extracted by an EMR integrated search engine. The concordance of extracted data with the original EMR source was checked manually. RESULTS. Prevalence of T2DM was 4.0%, which corresponds well to previous estimates. By searching for drug prescriptions, diagnosis codes, and laboratory values, 38%, 53%, and 91% of relevant cases were found, respectively. The sensitivity of CXP regarding extraction of relevant cases was 100%. The specificity was 99.9% due to 12 non-T2DM cases extracted. The congruity at single-item level was 99.6%. The 13 incorrect data items were all located in the same structural module. CONCLUSION. The CXP is a reliable and accurate data mining tool to extract selective data from EMRs.
format Online
Article
Text
id pubmed-3282243
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Informa Healthcare
record_format MEDLINE/PubMed
spelling pubmed-32822432012-03-01 Automated data extraction—A feasible way to construct patient registers of primary care utilization Martinell, Mats Stålhammar, Jan Hallqvist, Johan Ups J Med Sci Original Articles INTRODUCTION. Electronic medical records (EMRs) enable analysis of health care data by using data mining techniques to build research databases. Though the reliability of the data extraction process is crucial for the credibility of the final analysis, there are few published validations of this process. In this paper we validate the performance of an automated data mining tool on EMRs in a primary care setting. METHODS. The Pygargus Customized eXtraction Program (CXP) was programmed to find and then extract data from patients meeting criteria for type 2 diabetes mellitus (T2DM) at one primary health care clinic (PHC). The ability of CXP to extract relevant cases was assessed by comparing them with cases extracted by an EMR integrated search engine. The concordance of extracted data with the original EMR source was checked manually. RESULTS. Prevalence of T2DM was 4.0%, which corresponds well to previous estimates. By searching for drug prescriptions, diagnosis codes, and laboratory values, 38%, 53%, and 91% of relevant cases were found, respectively. The sensitivity of CXP regarding extraction of relevant cases was 100%. The specificity was 99.9% due to 12 non-T2DM cases extracted. The congruity at single-item level was 99.6%. The 13 incorrect data items were all located in the same structural module. CONCLUSION. The CXP is a reliable and accurate data mining tool to extract selective data from EMRs.
Informa Healthcare 2012-03 2012-02-15 /pmc/articles/PMC3282243/ /pubmed/22335391 http://dx.doi.org/10.3109/03009734.2011.653015 Text en © Informa Healthcare http://creativecommons.org/licenses/by/2.5/ This is an open-access article distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the source is credited.
spellingShingle Original Articles
Martinell, Mats
Stålhammar, Jan
Hallqvist, Johan
Automated data extraction—A feasible way to construct patient registers of primary care utilization
title Automated data extraction—A feasible way to construct patient registers of primary care utilization
title_full Automated data extraction—A feasible way to construct patient registers of primary care utilization
title_fullStr Automated data extraction—A feasible way to construct patient registers of primary care utilization
title_full_unstemmed Automated data extraction—A feasible way to construct patient registers of primary care utilization
title_short Automated data extraction—A feasible way to construct patient registers of primary care utilization
title_sort automated data extraction—a feasible way to construct patient registers of primary care utilization
topic Original Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3282243/
https://www.ncbi.nlm.nih.gov/pubmed/22335391
http://dx.doi.org/10.3109/03009734.2011.653015
work_keys_str_mv AT martinellmats automateddataextractionafeasiblewaytoconstructpatientregistersofprimarycareutilization
AT stalhammarjan automateddataextractionafeasiblewaytoconstructpatientregistersofprimarycareutilization
AT hallqvistjohan automateddataextractionafeasiblewaytoconstructpatientregistersofprimarycareutilization
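
The sensitivity and specificity figures quoted in the description follow the standard definitions for a case-finding tool checked against a reference search. The sketch below shows that arithmetic in Python. The class and function names are hypothetical, and every count is an illustrative assumption except the 12 false positives and the absence of missed cases, which the abstract reports; the record does not give the size of the clinic population, so the totals here are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionCounts:
    """Confusion counts for an extraction tool versus a reference search."""
    true_pos: int   # relevant cases the tool extracted
    false_neg: int  # relevant cases the tool missed
    false_pos: int  # non-relevant cases the tool extracted
    true_neg: int   # non-relevant cases the tool correctly left out


def sensitivity(c: ExtractionCounts) -> float:
    """Share of relevant cases that were extracted: TP / (TP + FN)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ExtractionCounts) -> float:
    """Share of non-relevant cases that were left out: TN / (TN + FP)."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical totals chosen only to illustrate the formulas.
# false_neg=0 and false_pos=12 are taken from the abstract; the rest is assumed.
counts = ExtractionCounts(true_pos=400, false_neg=0, false_pos=12, true_neg=11988)

print(f"sensitivity = {sensitivity(counts):.1%}")
print(f"specificity = {specificity(counts):.1%}")
```

With no missed cases the sensitivity is 100% regardless of the other counts, while the specificity depends on how many non-T2DM patients the 12 false positives are measured against.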