Considerations for automated machine learning in clinical metabolic profiling: Altered homocysteine plasma concentration associated with metformin exposure

With the maturation of metabolomics science and proliferation of biobanks, clinical metabolic profiling is an increasingly opportunistic frontier for advancing translational clinical research. Automated Machine Learning (AutoML) approaches provide an exciting opportunity to guide feature selection in a...

Bibliographic Details
Main authors: Orlenko, Alena, Moore, Jason H., Orzechowski, Patryk, Olson, Randal S., Cairns, Junmei, Caraballo, Pedro J., Weinshilboum, Richard M., Wang, Liewei, Breitenstein, Matthew K.
Format: Online Article Text
Language: English
Published: 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5882490/
https://www.ncbi.nlm.nih.gov/pubmed/29218905
_version_ 1783311480573132800
author Orlenko, Alena
Moore, Jason H.
Orzechowski, Patryk
Olson, Randal S.
Cairns, Junmei
Caraballo, Pedro J.
Weinshilboum, Richard M.
Wang, Liewei
Breitenstein, Matthew K.
author_sort Orlenko, Alena
collection PubMed
description With the maturation of metabolomics science and proliferation of biobanks, clinical metabolic profiling is an increasingly opportunistic frontier for advancing translational clinical research. Automated Machine Learning (AutoML) approaches provide an exciting opportunity to guide feature selection in agnostic metabolic profiling endeavors, where potentially thousands of independent data points must be evaluated. In previous research, AutoML using high-dimensional data of varying types has been demonstrably robust, outperforming traditional approaches. However, considerations for application in clinical metabolic profiling remain to be evaluated, particularly regarding the robustness of AutoML in identifying and adjusting for common clinical confounders. In this study, we present a focused case study regarding AutoML considerations for using the Tree-based Pipeline Optimization Tool (TPOT) in metabolic profiling of exposure to metformin in a biobank cohort. First, we propose a tandem rank-accuracy measure to guide agnostic feature selection and corresponding threshold determination in clinical metabolic profiling endeavors. Second, while AutoML, using default parameters, showed a potential lack of sensitivity to low-effect confounding clinical covariates, we demonstrated residual training and adjustment of metabolite features as an easily applicable approach to ensure AutoML adjustment for potential confounding characteristics. Finally, we present increased homocysteine with long-term exposure to metformin as a potentially novel, non-replicated metabolite association suggested by TPOT, an association not identified in parallel clinical metabolic profiling endeavors. While warranting independent replication, our tandem rank-accuracy measure suggests homocysteine to be the metabolite feature with the largest effect and a corresponding priority for further translational clinical research. Residual training and adjustment for a potential confounding effect by BMI only slightly modified the suggested association. Increased homocysteine is thought to be associated with vitamin B12 deficiency; evaluation for potential clinical relevance is suggested. While considerations for clinical metabolic profiling are recommended, including adjustment approaches for clinical confounders, AutoML presents an exciting tool to enhance clinical metabolic profiling and advance translational research endeavors.
format Online
Article
Text
id pubmed-5882490
institution National Center for Biotechnology Information
language English
publishDate 2018
record_format MEDLINE/PubMed
spelling pubmed-5882490 2018-04-04 Pac Symp Biocomput Article 2018 /pmc/articles/PMC5882490/ /pubmed/29218905 Text en http://creativecommons.org/licenses/by-nc/4.0/ Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 License
title Considerations for automated machine learning in clinical metabolic profiling: Altered homocysteine plasma concentration associated with metformin exposure
topic Article