Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients

BACKGROUND AND AIM: Available evidence suggests elevated serum prolactin (PRL) levels in olanzapine (OLZ)-treated patients with schizophrenia. However, machine learning (ML)-based comprehensive evaluations of the influence of pathophysiological and pharmacological factors on PRL levels in OLZ-treate...

Bibliographic Details
Main authors: Zhu, Xiuqing, Hu, Jinqing, Xiao, Tao, Huang, Shanqing, Shang, Dewei, Wen, Yuguan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Endocrinology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9606398/
https://www.ncbi.nlm.nih.gov/pubmed/36313772
http://dx.doi.org/10.3389/fendo.2022.1011492
author Zhu, Xiuqing
Hu, Jinqing
Xiao, Tao
Huang, Shanqing
Shang, Dewei
Wen, Yuguan
author_sort Zhu, Xiuqing
collection PubMed
description BACKGROUND AND AIM: Available evidence suggests elevated serum prolactin (PRL) levels in olanzapine (OLZ)-treated patients with schizophrenia. However, machine learning (ML)-based comprehensive evaluations of the influence of pathophysiological and pharmacological factors on PRL levels in OLZ-treated patients are rare. We aimed to forecast the PRL level in OLZ-treated patients and mine pharmacovigilance information on PRL-related adverse events by integrating ML and electronic health record (EHR) data.
METHODS: Data were extracted from an EHR system to construct an ML dataset in 672×384 matrix format after preprocessing, which was subsequently randomly divided into a derivation cohort for model development and a validation cohort for model validation (8:2). The eXtreme gradient boosting (XGBoost) algorithm was used to build the ML models, the importance of the features and predictive behaviors of which were illustrated by SHapley Additive exPlanations (SHAP)-based analyses. The sequential forward feature selection approach was used to generate the optimal feature subset. The co-administered drugs that might have influenced PRL levels during OLZ treatment as identified by SHAP analyses were then compared with evidence from disproportionality analyses by using OpenVigil FDA.
RESULTS: The 15 features that made the greatest contributions, as ranked by the mean (|SHAP value|), were identified as the optimal feature subset. The features were gender_male, co-administration of risperidone, age, co-administration of aripiprazole, concentration of aripiprazole, concentration of OLZ, progesterone, co-administration of sulpiride, creatine kinase, serum sodium, serum phosphorus, testosterone, platelet distribution width, α-L-fucosidase, and lipoprotein (a). The XGBoost model after feature selection delivered good performance on the validation cohort with a mean absolute error of 0.046, mean squared error of 0.0036, root-mean-squared error of 0.060, and mean relative error of 11%. Risperidone and aripiprazole exhibited the strongest associations with hyperprolactinemia and decreased blood PRL according to the disproportionality analyses, and both were identified as co-administered drugs that influenced PRL levels during OLZ treatment by SHAP analyses.
CONCLUSIONS: Multiple pathophysiological and pharmacological confounders influence PRL levels associated with effective treatment and PRL-related side-effects in OLZ-treated patients. Our study highlights the feasibility of integration of ML and EHR data to facilitate the detection of PRL levels and pharmacovigilance signals in OLZ-treated patients.
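The four validation errors quoted in the abstract are standard regression metrics, and they can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `regression_errors` and the sample values are hypothetical, and the abstract does not define how "mean relative error" was computed, so the definition below (absolute error divided by the observed value, averaged) is an assumption. The last lines check that the reported figures are internally consistent, since the root-mean-squared error should equal the square root of the mean squared error.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE and mean relative error for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # Assumed definition: |error| scaled by the observed value (must be non-zero).
    mre = sum(abs(r) / abs(t) for r, t in zip(residuals, y_true)) / n
    return {"mae": mae, "mse": mse, "rmse": rmse, "mre": mre}


# Hypothetical observed and predicted values, NOT data from the study.
observed = [0.50, 0.40, 0.80, 0.20]
predicted = [0.55, 0.35, 0.70, 0.25]
metrics = regression_errors(observed, predicted)

# Consistency check on the figures reported in the abstract:
# sqrt(0.0036) should match the reported RMSE of 0.060.
reported_mse = 0.0036
reported_rmse = 0.060
assert math.isclose(math.sqrt(reported_mse), reported_rmse, abs_tol=1e-9)
```

The abstract does not state the units or scale of the PRL target, so the reported error values should be read relative to whatever scale the authors used.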
format Online
Article
Text
id pubmed-9606398
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-96063982022-10-28 Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients Zhu, Xiuqing Hu, Jinqing Xiao, Tao Huang, Shanqing Shang, Dewei Wen, Yuguan Front Endocrinol (Lausanne) Endocrinology BACKGROUND AND AIM: Available evidence suggests elevated serum prolactin (PRL) levels in olanzapine (OLZ)-treated patients with schizophrenia. However, machine learning (ML)-based comprehensive evaluations of the influence of pathophysiological and pharmacological factors on PRL levels in OLZ-treated patients are rare. We aimed to forecast the PRL level in OLZ-treated patients and mine pharmacovigilance information on PRL-related adverse events by integrating ML and electronic health record (EHR) data. METHODS: Data were extracted from an EHR system to construct an ML dataset in 672×384 matrix format after preprocessing, which was subsequently randomly divided into a derivation cohort for model development and a validation cohort for model validation (8:2). The eXtreme gradient boosting (XGBoost) algorithm was used to build the ML models, the importance of the features and predictive behaviors of which were illustrated by SHapley Additive exPlanations (SHAP)-based analyses. The sequential forward feature selection approach was used to generate the optimal feature subset. The co-administered drugs that might have influenced PRL levels during OLZ treatment as identified by SHAP analyses were then compared with evidence from disproportionality analyses by using OpenVigil FDA. RESULTS: The 15 features that made the greatest contributions, as ranked by the mean (|SHAP value|), were identified as the optimal feature subset. 
The features were gender_male, co-administration of risperidone, age, co-administration of aripiprazole, concentration of aripiprazole, concentration of OLZ, progesterone, co-administration of sulpiride, creatine kinase, serum sodium, serum phosphorus, testosterone, platelet distribution width, α-L-fucosidase, and lipoprotein (a). The XGBoost model after feature selection delivered good performance on the validation cohort with a mean absolute error of 0.046, mean squared error of 0.0036, root-mean-squared error of 0.060, and mean relative error of 11%. Risperidone and aripiprazole exhibited the strongest associations with hyperprolactinemia and decreased blood PRL according to the disproportionality analyses, and both were identified as co-administered drugs that influenced PRL levels during OLZ treatment by SHAP analyses. CONCLUSIONS: Multiple pathophysiological and pharmacological confounders influence PRL levels associated with effective treatment and PRL-related side-effects in OLZ-treated patients. Our study highlights the feasibility of integration of ML and EHR data to facilitate the detection of PRL levels and pharmacovigilance signals in OLZ-treated patients. Frontiers Media S.A. 2022-10-13 /pmc/articles/PMC9606398/ /pubmed/36313772 http://dx.doi.org/10.3389/fendo.2022.1011492 Text en Copyright © 2022 Zhu, Hu, Xiao, Huang, Shang and Wen https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
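The disproportionality analyses mentioned in the abstract compare how often an adverse event is reported with one drug against how often it is reported with all other drugs. The abstract does not say which measure was used, so the sketch below assumes the reporting odds ratio (ROR), one widely used measure, computed from a 2×2 contingency table with an approximate 95% confidence interval. The function name and all counts are hypothetical and are not taken from the study or from OpenVigil FDA.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR and approximate 95% CI from a 2x2 table of report counts.

    a: target drug, target event      b: target drug, other events
    c: other drugs, target event      d: other drugs, other events
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


# Hypothetical report counts, NOT figures from the study.
ror, lower, upper = reporting_odds_ratio(a=20, b=80, c=10, d=890)

# One commonly used signal criterion: lower CI bound above 1.
signal = lower > 1
```

A lower confidence bound above 1 is only one of several signal criteria in use; the thresholds the authors applied are not given in the abstract.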
spellingShingle Endocrinology
Zhu, Xiuqing
Hu, Jinqing
Xiao, Tao
Huang, Shanqing
Shang, Dewei
Wen, Yuguan
Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients
title Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients
topic Endocrinology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9606398/
https://www.ncbi.nlm.nih.gov/pubmed/36313772
http://dx.doi.org/10.3389/fendo.2022.1011492