Applying Machine Learning Techniques to Identify Undiagnosed Patients with Exocrine Pancreatic Insufficiency
BACKGROUND: Exocrine pancreatic insufficiency (EPI) is a serious condition characterized by a lack of functional exocrine pancreatic enzymes and the resultant inability to properly digest nutrients. EPI can be caused by a variety of disorders, including chronic pancreatitis, pancreatic cancer, and c...
Main authors: | Pyenson, Bruce; Alston, Maggie; Gomberg, Jeffrey; Han, Feng; Khandelwal, Nikhil; Dei, Motoharu; Son, Monica; Vora, Jaime |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Columbia Data Analytics, LLC, 2019 |
Subjects: | Methodology and Health Care Policy |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7299452/ https://www.ncbi.nlm.nih.gov/pubmed/32685578 http://dx.doi.org/10.36469/9727 |
_version_ | 1783547390113873920 |
---|---|
author | Pyenson, Bruce; Alston, Maggie; Gomberg, Jeffrey; Han, Feng; Khandelwal, Nikhil; Dei, Motoharu; Son, Monica; Vora, Jaime |
author_sort | Pyenson, Bruce |
collection | PubMed |
description | BACKGROUND: Exocrine pancreatic insufficiency (EPI) is a serious condition characterized by a lack of functional exocrine pancreatic enzymes and the resultant inability to properly digest nutrients. EPI can be caused by a variety of disorders, including chronic pancreatitis, pancreatic cancer, and celiac disease. EPI remains underdiagnosed because of the nonspecific nature of clinical symptoms, lack of an ideal diagnostic test, and the inability to easily identify affected patients using administrative claims data. OBJECTIVES: To develop a machine learning model that identifies patients in a commercial medical claims database who likely have EPI but are undiagnosed. METHODS: A machine learning algorithm was developed in Scikit-learn, a Python module. The study population, selected from the 2014 Truven MarketScan® Commercial Claims Database, consisted of patients with EPI-prone conditions. Patients were labeled with 290 condition category flags and split into actual positive EPI cases, actual negative EPI cases, and unlabeled cases. The study population was then randomly divided into a training subset and a testing subset. The training subset was used to determine the performance metrics of 27 models and to select the highest performing model, and the testing subset was used to evaluate performance of the best machine learning model. RESULTS: The study population consisted of 2088 actual positive EPI cases, 1077 actual negative EPI cases, and 437 530 unlabeled cases. In the best performing model, the precision, recall, and accuracy were 0.91, 0.80, and 0.86, respectively. The best-performing model estimated that the number of patients likely to have EPI was about 12 times the number of patients directly identified as EPI-positive through a claims analysis in the study population. The most important features in assigning EPI probability were the presence or absence of diagnosis codes related to pancreatic and digestive conditions. CONCLUSIONS: Machine learning techniques demonstrated high predictive power in identifying patients with EPI and could facilitate an enhanced understanding of its etiology and help to identify patients for possible diagnosis and treatment. |
format | Online Article Text |
id | pubmed-7299452 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Columbia Data Analytics, LLC |
record_format | MEDLINE/PubMed |
spelling | pubmed-7299452 2020-07-16 J Health Econ Outcomes Res Methodology and Health Care Policy Columbia Data Analytics, LLC 2019-02-14 /pmc/articles/PMC7299452/ /pubmed/32685578 http://dx.doi.org/10.36469/9727 Text en This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0). View this license’s legal deed at http://creativecommons.org/licenses/by/4.0 and legal code at http://creativecommons.org/licenses/by/4.0/legalcode for more information. |
title | Applying Machine Learning Techniques to Identify Undiagnosed Patients with Exocrine Pancreatic Insufficiency |
topic | Methodology and Health Care Policy |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7299452/ https://www.ncbi.nlm.nih.gov/pubmed/32685578 http://dx.doi.org/10.36469/9727 |
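The abstract reports precision, recall, and accuracy for the best-performing model. As a reminder of how these three metrics are defined, here is a minimal sketch in plain Python. It is not the authors' code: the study used Scikit-learn, which offers `precision_score`, `recall_score`, and `accuracy_score` for the same purpose. The `Metrics` class, the `classification_metrics` function, and the toy labels are hypothetical and chosen only for illustration; they are not study data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Metrics:
    precision: float
    recall: float
    accuracy: float


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute precision, recall, and accuracy for binary labels (1 = EPI, 0 = not EPI)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, accuracy)


# Hypothetical toy labels (NOT from the study): 5 positives, 5 negatives.
toy_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Precision answers "of the patients flagged as EPI, how many truly are?", recall answers "of the true EPI patients, how many were flagged?", and accuracy is the share of all labeled cases classified correctly. In the study these would be computed on the held-out testing subset of labeled cases only, since unlabeled cases have no ground truth to compare against.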