Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients
BACKGROUND AND AIM: Available evidence suggests elevated serum prolactin (PRL) levels in olanzapine (OLZ)-treated patients with schizophrenia. However, machine learning (ML)-based comprehensive evaluations of the influence of pathophysiological and pharmacological factors on PRL levels in OLZ-treate...
Main authors: | Zhu, Xiuqing; Hu, Jinqing; Xiao, Tao; Huang, Shanqing; Shang, Dewei; Wen, Yuguan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Endocrinology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9606398/ https://www.ncbi.nlm.nih.gov/pubmed/36313772 http://dx.doi.org/10.3389/fendo.2022.1011492 |
_version_ | 1784818288817078272 |
---|---|
author | Zhu, Xiuqing Hu, Jinqing Xiao, Tao Huang, Shanqing Shang, Dewei Wen, Yuguan |
author_facet | Zhu, Xiuqing Hu, Jinqing Xiao, Tao Huang, Shanqing Shang, Dewei Wen, Yuguan |
author_sort | Zhu, Xiuqing |
collection | PubMed |
description | BACKGROUND AND AIM: Available evidence suggests elevated serum prolactin (PRL) levels in olanzapine (OLZ)-treated patients with schizophrenia. However, machine learning (ML)-based comprehensive evaluations of the influence of pathophysiological and pharmacological factors on PRL levels in OLZ-treated patients are rare. We aimed to forecast the PRL level in OLZ-treated patients and mine pharmacovigilance information on PRL-related adverse events by integrating ML and electronic health record (EHR) data. METHODS: Data were extracted from an EHR system to construct an ML dataset in 672×384 matrix format after preprocessing, which was subsequently randomly divided into a derivation cohort for model development and a validation cohort for model validation (8:2). The eXtreme gradient boosting (XGBoost) algorithm was used to build the ML models, the importance of the features and predictive behaviors of which were illustrated by SHapley Additive exPlanations (SHAP)-based analyses. The sequential forward feature selection approach was used to generate the optimal feature subset. The co-administered drugs that might have influenced PRL levels during OLZ treatment as identified by SHAP analyses were then compared with evidence from disproportionality analyses by using OpenVigil FDA. RESULTS: The 15 features that made the greatest contributions, as ranked by the mean (|SHAP value|), were identified as the optimal feature subset. The features were gender_male, co-administration of risperidone, age, co-administration of aripiprazole, concentration of aripiprazole, concentration of OLZ, progesterone, co-administration of sulpiride, creatine kinase, serum sodium, serum phosphorus, testosterone, platelet distribution width, α-L-fucosidase, and lipoprotein (a). The XGBoost model after feature selection delivered good performance on the validation cohort with a mean absolute error of 0.046, mean squared error of 0.0036, root-mean-squared error of 0.060, and mean relative error of 11%. Risperidone and aripiprazole exhibited the strongest associations with hyperprolactinemia and decreased blood PRL according to the disproportionality analyses, and both were identified as co-administered drugs that influenced PRL levels during OLZ treatment by SHAP analyses. CONCLUSIONS: Multiple pathophysiological and pharmacological confounders influence PRL levels associated with effective treatment and PRL-related side-effects in OLZ-treated patients. Our study highlights the feasibility of integration of ML and EHR data to facilitate the detection of PRL levels and pharmacovigilance signals in OLZ-treated patients. |
format | Online Article Text |
id | pubmed-9606398 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9606398 2022-10-28 Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients Zhu, Xiuqing Hu, Jinqing Xiao, Tao Huang, Shanqing Shang, Dewei Wen, Yuguan Front Endocrinol (Lausanne) Endocrinology Frontiers Media S.A. 2022-10-13 /pmc/articles/PMC9606398/ /pubmed/36313772 http://dx.doi.org/10.3389/fendo.2022.1011492 Text en Copyright © 2022 Zhu, Hu, Xiao, Huang, Shang and Wen https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Endocrinology Zhu, Xiuqing Hu, Jinqing Xiao, Tao Huang, Shanqing Shang, Dewei Wen, Yuguan Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
title | Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
title_full | Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
title_fullStr | Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
title_full_unstemmed | Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
title_short | Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
title_sort | integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients |
topic | Endocrinology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9606398/ https://www.ncbi.nlm.nih.gov/pubmed/36313772 http://dx.doi.org/10.3389/fendo.2022.1011492 |
work_keys_str_mv | AT zhuxiuqing integratingmachinelearningwithelectronichealthrecorddatatofacilitatedetectionofprolactinlevelandpharmacovigilancesignalsinolanzapinetreatedpatients AT hujinqing integratingmachinelearningwithelectronichealthrecorddatatofacilitatedetectionofprolactinlevelandpharmacovigilancesignalsinolanzapinetreatedpatients AT xiaotao integratingmachinelearningwithelectronichealthrecorddatatofacilitatedetectionofprolactinlevelandpharmacovigilancesignalsinolanzapinetreatedpatients AT huangshanqing integratingmachinelearningwithelectronichealthrecorddatatofacilitatedetectionofprolactinlevelandpharmacovigilancesignalsinolanzapinetreatedpatients AT shangdewei integratingmachinelearningwithelectronichealthrecorddatatofacilitatedetectionofprolactinlevelandpharmacovigilancesignalsinolanzapinetreatedpatients AT wenyuguan integratingmachinelearningwithelectronichealthrecorddatatofacilitatedetectionofprolactinlevelandpharmacovigilancesignalsinolanzapinetreatedpatients |
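
The description reports four error measures for the validation cohort: mean absolute error, mean squared error, root-mean-squared error, and mean relative error. The sketch below shows one common way to compute them from paired observed and predicted values. It is not the authors' code: the function name `regression_errors` and the sample values are hypothetical, and the record does not give the exact definition of mean relative error, so the form used here (mean of |error| / |observed value|) is an assumption.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE and mean relative error for paired values.

    Hypothetical helper for illustration; assumes no observed value is zero.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [pred - true for true, pred in zip(y_true, y_pred)]
    n = len(residuals)

    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # Assumed definition: average absolute error relative to the observed value.
    mre = sum(abs(r) / abs(t) for r, t in zip(residuals, y_true)) / n

    return {"mae": mae, "mse": mse, "rmse": rmse, "mre": mre}


# Made-up values for illustration only; these are not data from the study.
y_true = [0.50, 0.40, 0.80, 0.25]
y_pred = [0.45, 0.46, 0.80, 0.30]
errors = regression_errors(y_true, y_pred)
```

Because RMSE is the square root of MSE, the two reported figures agree with each other: the square root of 0.0036 is 0.060.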