Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance
Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning tec...
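Main authors: | Roe, Kenneth D.; Jawa, Vibhu; Zhang, Xiaohan; Chute, Christopher G.; Epstein, Jeremy A.; Matelsky, Jordan; Shpitser, Ilya; Taylor, Casey Overby |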
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179831/ https://www.ncbi.nlm.nih.gov/pubmed/32324754 http://dx.doi.org/10.1371/journal.pone.0231300 |
_version_ | 1783525708518129664 |
---|---|
author | Roe, Kenneth D. Jawa, Vibhu Zhang, Xiaohan Chute, Christopher G. Epstein, Jeremy A. Matelsky, Jordan Shpitser, Ilya Taylor, Casey Overby |
author_facet | Roe, Kenneth D. Jawa, Vibhu Zhang, Xiaohan Chute, Christopher G. Epstein, Jeremy A. Matelsky, Jordan Shpitser, Ilya Taylor, Casey Overby |
author_sort | Roe, Kenneth D. |
collection | PubMed |
description | Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning techniques, and to assess the impact of the approach on model complexity and performance. Four machine learning models were trained to predict mortality with a severe asthma case study. Experiments to select fewer input features based on a discriminative score showed low to moderate precision for discovering clinically meaningful triplets, indicating that discriminative score alone cannot replace clinical input. When compared to baseline machine learning models, we found a decrease in model complexity with use of fewer features informed by discriminative score and filtering of laboratory features with clinical input. We also found a small difference in performance for the mortality prediction task when comparing baseline ML models to models that used filtered features. Encoding demographic and triplet information in ML models with filtered features appeared to show performance improvements from the baseline. These findings indicated that the use of filtered features may reduce model complexity, and with little impact on performance. |
format | Online Article Text |
id | pubmed-7179831 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-71798312020-04-29 Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance Roe, Kenneth D. Jawa, Vibhu Zhang, Xiaohan Chute, Christopher G. Epstein, Jeremy A. Matelsky, Jordan Shpitser, Ilya Taylor, Casey Overby PLoS One Research Article Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning techniques, and to assess the impact of the approach on model complexity and performance. Four machine learning models were trained to predict mortality with a severe asthma case study. Experiments to select fewer input features based on a discriminative score showed low to moderate precision for discovering clinically meaningful triplets, indicating that discriminative score alone cannot replace clinical input. When compared to baseline machine learning models, we found a decrease in model complexity with use of fewer features informed by discriminative score and filtering of laboratory features with clinical input. We also found a small difference in performance for the mortality prediction task when comparing baseline ML models to models that used filtered features. Encoding demographic and triplet information in ML models with filtered features appeared to show performance improvements from the baseline. These findings indicated that the use of filtered features may reduce model complexity, and with little impact on performance. 
Public Library of Science 2020-04-23 /pmc/articles/PMC7179831/ /pubmed/32324754 http://dx.doi.org/10.1371/journal.pone.0231300 Text en © 2020 Roe et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Roe, Kenneth D. Jawa, Vibhu Zhang, Xiaohan Chute, Christopher G. Epstein, Jeremy A. Matelsky, Jordan Shpitser, Ilya Taylor, Casey Overby Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance |
title | Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179831/ https://www.ncbi.nlm.nih.gov/pubmed/32324754 http://dx.doi.org/10.1371/journal.pone.0231300 |