
Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system

BACKGROUND: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease. METHODS: The principal diagnosis, co-morbidity and smokin...


Bibliographic Details
Main authors: Zeng, Qing T; Goryachev, Sergey; Weiss, Scott; Sordo, Margarita; Murphy, Shawn N; Lazarus, Ross
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1553439/
https://www.ncbi.nlm.nih.gov/pubmed/16872495
http://dx.doi.org/10.1186/1472-6947-6-30
_version_ 1782129346634317824
author Zeng, Qing T
Goryachev, Sergey
Weiss, Scott
Sordo, Margarita
Murphy, Shawn N
Lazarus, Ross
author_sort Zeng, Qing T
collection PubMed
description BACKGROUND: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease. METHODS: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard. RESULTS: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded. CONCLUSION: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks.
format Text
id pubmed-1553439
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1553439 2006-08-25 Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system Zeng, Qing T Goryachev, Sergey Weiss, Scott Sordo, Margarita Murphy, Shawn N Lazarus, Ross BMC Med Inform Decis Mak Research Article BACKGROUND: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease. METHODS: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard. RESULTS: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded. CONCLUSION: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks. BioMed Central 2006-07-26 /pmc/articles/PMC1553439/ /pubmed/16872495 http://dx.doi.org/10.1186/1472-6947-6-30 Text en Copyright © 2006 Zeng et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system
topic Research Article