An End-to-End Natural Language Processing Application for Prediction of Medical Case Coding Complexity: Algorithm Development and Validation

Bibliographic Details
Main Authors:	Xu, He Ayu, Maccari, Bernard, Guillain, Hervé, Herzen, Julien, Agri, Fabio, Raisaro, Jean Louis
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2023
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9896350/ https://www.ncbi.nlm.nih.gov/pubmed/36656627 http://dx.doi.org/10.2196/38150

_version_	1784882039471734784
author	Xu, He Ayu Maccari, Bernard Guillain, Hervé Herzen, Julien Agri, Fabio Raisaro, Jean Louis
collection	PubMed
description	BACKGROUND: Medical coding is the process that converts clinical documentation into standard medical codes. Codes are used for several key purposes in a hospital (eg, insurance reimbursement and performance analysis); therefore, their optimization is crucial. With the rapid growth of natural language processing technologies, several solutions based on artificial intelligence have been proposed to aid in medical coding by automatically suggesting relevant codes for clinical documents. However, their effectiveness is still limited to simple cases, and it is not yet clear how much value they can bring in improving coding efficiency and accuracy. OBJECTIVE: This study aimed to bring more efficiency to the coding process to improve the selection of codes by medical coders. To achieve this, we developed an innovative multimodal machine learning–based solution that, instead of predicting codes, detects the degree of coding complexity before coding is performed. The notion of coding complexity was used to better dispatch work among medical coders to eventually minimize errors and improve throughput. METHODS: To train and evaluate our approach, we collected 2060 cases rated by coders in terms of coding complexity from 1 (simplest) to 4 (most complex). We asked 2 expert coders to rate 3.01% (62/2060) of the cases as the gold standard. The agreements between experts were used as benchmarks for model evaluation. A case contains both clinical text and patient metadata from the hospital electronic health record. We extracted both text features and metadata features, then concatenated and fed them into several machine learning models. Finally, we selected 2 models. The first used cross-validated training on 1751 cases and testing on 309 cases aiming to assess the predictive power of the proposed approach and its generalizability. The second model was trained on 1998 cases and tested on the gold standard to validate the best model performance against human benchmarks. 
RESULTS: Our first model achieved a macro-F1 score of 0.51 and an accuracy of 0.59 on classifying the 4-scale complexity. The model distinguished well between the simple (combined complexity 1-2) and complex (combined complexity 3-4) cases with a macro-F1 score of 0.65 and an accuracy of 0.71. Our second model achieved 61% agreement with experts’ ratings and a macro-F1 score of 0.62 on the gold standard, whereas the 2 experts had a 66% (41/62) agreement ratio with a macro-F1 score of 0.67. CONCLUSIONS: We propose a multimodal machine learning approach that leverages information from both clinical text and patient metadata to predict the complexity of coding a case in the precoding phase. By integrating this model into the hospital coding system, distribution of cases among coders can be done automatically with performance comparable with that of human expert coders, thus improving coding efficiency and accuracy at scale.
format	Online Article Text
id	pubmed-9896350
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-9896350 2023-02-04 An End-to-End Natural Language Processing Application for Prediction of Medical Case Coding Complexity: Algorithm Development and Validation Xu, He Ayu Maccari, Bernard Guillain, Hervé Herzen, Julien Agri, Fabio Raisaro, Jean Louis JMIR Med Inform Original Paper JMIR Publications 2023-01-19 /pmc/articles/PMC9896350/ /pubmed/36656627 http://dx.doi.org/10.2196/38150 Text en ©He Ayu Xu, Bernard Maccari, Hervé Guillain, Julien Herzen, Fabio Agri, Jean Louis Raisaro. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 19.01.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title	An End-to-End Natural Language Processing Application for Prediction of Medical Case Coding Complexity: Algorithm Development and Validation
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9896350/ https://www.ncbi.nlm.nih.gov/pubmed/36656627 http://dx.doi.org/10.2196/38150
