A tree based approach for multi-class classification of surgical procedures using structured and unstructured data
BACKGROUND: In surgical departments, CPT code assignment has been a complicated manual effort that entails significant related knowledge and experience. While there are several studies using CPTs to make predictions in surgical services, literature on predicting CPTs in surgical and other serv...
Main authors: | Khaleghi, Tannaz; Murat, Alper; Arslanturk, Suzan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8612004/ https://www.ncbi.nlm.nih.gov/pubmed/34814905 http://dx.doi.org/10.1186/s12911-021-01665-w |
_version_ | 1784603402670440448 |
---|---|
author | Khaleghi, Tannaz; Murat, Alper; Arslanturk, Suzan |
author_facet | Khaleghi, Tannaz; Murat, Alper; Arslanturk, Suzan |
author_sort | Khaleghi, Tannaz |
collection | PubMed |
description | BACKGROUND: In surgical departments, CPT code assignment has been a complicated manual effort that entails significant related knowledge and experience. While there are several studies using CPTs to make predictions in surgical services, literature on predicting CPTs in surgical and other services using text features is very sparse. This study improves the prediction of CPTs by means of informative features and a novel re-prioritization algorithm. METHODS: The input data used in this study are composed of both structured and unstructured data. The ground truth labels (CPTs) are obtained from medical coding databases using relative value units, which indicate the major operational procedures in each surgery case. In the modeling process, we first use a Random Forest multi-class classification model to predict the CPT codes. Second, we extract key information such as label probabilities, feature importance measures, and medical term frequency. Then, these factors are used in a novel algorithm to rearrange the alternative CPT codes in the list of potential candidates based on the calculated weights. RESULTS: To evaluate the performance of both phases, prediction and complementary improvement, we report the accuracy scores of multi-class CPT prediction tasks for datasets of 5 key surgery case specialties. The Random Forest model performs the classification task with 74–76% accuracy when predicting the primary CPT (accuracy@1) versus the CPT set (accuracy@2) with respect to two filtering conditions on CPT codes. The complementary algorithm improves the results from the initial step by 8% on average. Furthermore, the incorporated text features enhanced the quality of the output by 20–35%. The model outperforms the state-of-the-art neural network model with respect to accuracy, precision and recall. CONCLUSIONS: We have established a robust framework based on a decision tree predictive model. We predict the surgical codes more accurately and robustly than the state-of-the-art deep neural structures, which can help immensely in both surgery billing and scheduling in such units. |
format | Online Article Text |
id | pubmed-8612004 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8612004 2021-11-29 A tree based approach for multi-class classification of surgical procedures using structured and unstructured data Khaleghi, Tannaz; Murat, Alper; Arslanturk, Suzan BMC Med Inform Decis Mak Research BioMed Central 2021-11-23 /pmc/articles/PMC8612004/ /pubmed/34814905 http://dx.doi.org/10.1186/s12911-021-01665-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Khaleghi, Tannaz Murat, Alper Arslanturk, Suzan A tree based approach for multi-class classification of surgical procedures using structured and unstructured data |
title | A tree based approach for multi-class classification of surgical procedures using structured and unstructured data |
title_sort | tree based approach for multi-class classification of surgical procedures using structured and unstructured data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8612004/ https://www.ncbi.nlm.nih.gov/pubmed/34814905 http://dx.doi.org/10.1186/s12911-021-01665-w |
work_keys_str_mv | AT khaleghitannaz atreebasedapproachformulticlassclassificationofsurgicalproceduresusingstructuredandunstructureddata AT muratalper atreebasedapproachformulticlassclassificationofsurgicalproceduresusingstructuredandunstructureddata AT arslanturksuzan atreebasedapproachformulticlassclassificationofsurgicalproceduresusingstructuredandunstructureddata AT khaleghitannaz treebasedapproachformulticlassclassificationofsurgicalproceduresusingstructuredandunstructureddata AT muratalper treebasedapproachformulticlassclassificationofsurgicalproceduresusingstructuredandunstructureddata AT arslanturksuzan treebasedapproachformulticlassclassificationofsurgicalproceduresusingstructuredandunstructureddata |
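
The description above reports accuracy@1 and accuracy@2 and mentions re-ordering candidate codes using label probabilities together with text-derived weights. The following is a minimal sketch of those two ideas, not the authors' algorithm: the linear blending rule, the `alpha` parameter, the `term_weights` input, and the placeholder labels `"A"`, `"B"`, `"C"` (not real CPT codes) are all assumptions made for illustration, and the per-class probabilities are hand-written rather than produced by a trained Random Forest.

```python
from typing import Dict, List, Sequence


def top_k_codes(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k candidate codes with the highest scores (ties broken by code)."""
    ranked = sorted(probs.items(), key=lambda item: (-item[1], item[0]))
    return [code for code, _ in ranked[:k]]


def accuracy_at_k(all_probs: Sequence[Dict[str, float]],
                  truths: Sequence[str], k: int) -> float:
    """Fraction of cases whose true code appears among the top-k candidates."""
    hits = sum(truth in top_k_codes(probs, k)
               for probs, truth in zip(all_probs, truths))
    return hits / len(truths)


def rerank(probs: Dict[str, float], term_weights: Dict[str, float],
           alpha: float = 0.5) -> Dict[str, float]:
    """Hypothetical re-prioritization: blend each class probability with a
    text-derived weight. The linear blend is an assumption, not the paper's rule."""
    return {code: (1 - alpha) * p + alpha * term_weights.get(code, 0.0)
            for code, p in probs.items()}


# Hand-written per-case class probabilities (stand-ins for classifier output).
PROBS = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.5, "B": 0.4, "C": 0.1},
    {"A": 0.2, "B": 0.5, "C": 0.3},
]
TRUTHS = ["A", "B", "C"]

ACC_AT_1 = accuracy_at_k(PROBS, TRUTHS, 1)
ACC_AT_2 = accuracy_at_k(PROBS, TRUTHS, 2)

# Made-up text weights for the second case: its notes favor code "B".
RERANKED = rerank(PROBS[1], {"A": 0.1, "B": 0.9})
TOP_AFTER_RERANK = top_k_codes(RERANKED, 1)

print(ACC_AT_1, ACC_AT_2, TOP_AFTER_RERANK)
```

In this toy data the second case is missed at k=1 but recovered at k=2, and the re-ranking step moves its true code to the top, which is the kind of improvement the description attributes to the complementary algorithm.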