Diagnosis code assignment: models and evaluation metrics

BACKGROUND AND OBJECTIVE: The volume of healthcare data is growing rapidly with the adoption of health information technology. We focus on automated ICD9 code assignment from discharge summary content and methods for evaluating such assignments. METHODS: We study ICD9 diagnosis codes and discharge s...

Bibliographic Details
Main Authors:	Perotte, Adler; Pivovarov, Rimma; Natarajan, Karthik; Weiskopf, Nicole; Wood, Frank; Elhadad, Noémie
Format:	Online Article Text
Language:	English
Published:	BMJ Publishing Group 2014
Subjects:	Research and Applications
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3932472/ https://www.ncbi.nlm.nih.gov/pubmed/24296907 http://dx.doi.org/10.1136/amiajnl-2013-002159

author	Perotte, Adler; Pivovarov, Rimma; Natarajan, Karthik; Weiskopf, Nicole; Wood, Frank; Elhadad, Noémie
author_sort	Perotte, Adler
collection	PubMed
description	BACKGROUND AND OBJECTIVE: The volume of healthcare data is growing rapidly with the adoption of health information technology. We focus on automated ICD9 code assignment from discharge summary content and methods for evaluating such assignments. METHODS: We study ICD9 diagnosis codes and discharge summaries from the publicly available Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC II) repository. We experiment with two coding approaches: one that treats each ICD9 code independently of each other (flat classifier), and one that leverages the hierarchical nature of ICD9 codes into its modeling (hierarchy-based classifier). We propose novel evaluation metrics, which reflect the distances among gold-standard and predicted codes and their locations in the ICD9 tree. Experimental setup, code for modeling, and evaluation scripts are made available to the research community. RESULTS: The hierarchy-based classifier outperforms the flat classifier with F-measures of 39.5% and 27.6%, respectively, when trained on 20 533 documents and tested on 2282 documents. While recall is improved at the expense of precision, our novel evaluation metrics show a more refined assessment: for instance, the hierarchy-based classifier identifies the correct sub-tree of gold-standard codes more often than the flat classifier. Error analysis reveals that gold-standard codes are not perfect, and as such the recall and precision are likely underestimated. CONCLUSIONS: Hierarchy-based classification yields better ICD9 coding than flat classification for MIMIC patients. Automated ICD9 coding is an example of a task for which data and tools can be shared and for which the research community can work together to build on shared models and advance the state of the art.
format	Online Article Text
id	pubmed-3932472
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BMJ Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-3932472 2014-02-24 Diagnosis code assignment: models and evaluation metrics. Perotte, Adler; Pivovarov, Rimma; Natarajan, Karthik; Weiskopf, Nicole; Wood, Frank; Elhadad, Noémie. J Am Med Inform Assoc. Research and Applications. BMJ Publishing Group 2014-03 2013-12-02 /pmc/articles/PMC3932472/ /pubmed/24296907 http://dx.doi.org/10.1136/amiajnl-2013-002159 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
title	Diagnosis code assignment: models and evaluation metrics
topic	Research and Applications
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3932472/ https://www.ncbi.nlm.nih.gov/pubmed/24296907 http://dx.doi.org/10.1136/amiajnl-2013-002159
