Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance

Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning tec...

Bibliographic Details
Main authors: Roe, Kenneth D.; Jawa, Vibhu; Zhang, Xiaohan; Chute, Christopher G.; Epstein, Jeremy A.; Matelsky, Jordan; Shpitser, Ilya; Taylor, Casey Overby
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179831/
https://www.ncbi.nlm.nih.gov/pubmed/32324754
http://dx.doi.org/10.1371/journal.pone.0231300
_version_ 1783525708518129664
author Roe, Kenneth D.
Jawa, Vibhu
Zhang, Xiaohan
Chute, Christopher G.
Epstein, Jeremy A.
Matelsky, Jordan
Shpitser, Ilya
Taylor, Casey Overby
author_sort Roe, Kenneth D.
collection PubMed
description Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning techniques, and to assess the impact of the approach on model complexity and performance. Four machine learning models were trained to predict mortality with a severe asthma case study. Experiments to select fewer input features based on a discriminative score showed low to moderate precision for discovering clinically meaningful triplets, indicating that discriminative score alone cannot replace clinical input. When compared to baseline machine learning models, we found a decrease in model complexity with use of fewer features informed by discriminative score and filtering of laboratory features with clinical input. We also found a small difference in performance for the mortality prediction task when comparing baseline ML models to models that used filtered features. Encoding demographic and triplet information in ML models with filtered features appeared to show performance improvements from the baseline. These findings indicated that the use of filtered features may reduce model complexity, and with little impact on performance.
format Online
Article
Text
id pubmed-7179831
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7179831 2020-04-29 Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance Roe, Kenneth D. Jawa, Vibhu Zhang, Xiaohan Chute, Christopher G. Epstein, Jeremy A. Matelsky, Jordan Shpitser, Ilya Taylor, Casey Overby PLoS One Research Article
Public Library of Science 2020-04-23 /pmc/articles/PMC7179831/ /pubmed/32324754 http://dx.doi.org/10.1371/journal.pone.0231300 Text en © 2020 Roe et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179831/
https://www.ncbi.nlm.nih.gov/pubmed/32324754
http://dx.doi.org/10.1371/journal.pone.0231300