Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study
BACKGROUND: With the advent of electronic storage of medical records and the internet, patients can access web-based medical records. This has facilitated doctor-patient communication and built trust between them. However, many patients avoid using web-based medical records despite their greater ava...
Main authors: | Chen, Yang; Liu, Xuejiao; Gao, Lei; Zhu, Miao; Shia, Ben-Chang; Chen, Mingchih; Ye, Linglong; Qin, Lei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2023 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337515/ https://www.ncbi.nlm.nih.gov/pubmed/37335616 http://dx.doi.org/10.2196/41576 |
_version_ | 1785071441896538112 |
---|---|
author | Chen, Yang Liu, Xuejiao Gao, Lei Zhu, Miao Shia, Ben-Chang Chen, Mingchih Ye, Linglong Qin, Lei |
author_sort | Chen, Yang |
collection | PubMed |
description | BACKGROUND: With the advent of electronic storage of medical records and the internet, patients can access web-based medical records. This has facilitated doctor-patient communication and built trust between them. However, many patients avoid using web-based medical records despite their greater availability and readability. OBJECTIVE: On the basis of demographic and individual behavioral characteristics, this study explores the predictors of web-based medical record nonuse among patients. METHODS: Data were collected from the National Cancer Institute 2019 to 2020 Health Information National Trends Survey. First, based on the data-rich environment, the chi-square test (categorical variables) and 2-tailed t tests (continuous variables) were performed on the response variables and the variables in the questionnaire. According to the test results, the variables were initially screened, and those that passed the test were selected for subsequent analysis. Second, participants were excluded from the study if any of the initially screened variables were missing. Third, the data obtained were modeled using 5 machine learning algorithms, namely, logistic regression, automatic generalized linear model, automatic random forest, automatic deep neural network, and automatic gradient boosting machine, to identify and investigate factors affecting web-based medical record nonuse. The aforementioned automatic machine learning algorithms were based on the R interface (R Foundation for Statistical Computing) of the H2O (H2O.ai) scalable machine learning platform. Finally, 5-fold cross-validation was adopted for 80% of the data set, which was used as the training data to determine hyperparameters of 5 algorithms, and 20% of the data set was used as the test data for model comparison. RESULTS: Among the 9072 respondents, 5409 (59.62%) had no experience using web-based medical records. 
Using the 5 algorithms, 29 variables were identified as crucial predictors of nonuse of web-based medical records. These 29 variables comprised 6 (21%) sociodemographic variables (age, BMI, race, marital status, education, and income) and 23 (79%) variables related to individual lifestyles and behavioral habits (such as electronic and internet use, individuals’ health status and their level of health concern, etc). H2O’s automatic machine learning methods have a high model accuracy. On the basis of the performance of the validation data set, the optimal model was the automatic random forest with the highest area under the curve in the validation set (88.52%) and the test set (82.87%). CONCLUSIONS: When monitoring web-based medical record use trends, research should focus on social factors such as age, education, BMI, and marital status, as well as personal lifestyle and behavioral habits, including smoking, use of electronic devices and the internet, patients’ personal health status, and their level of health concern. The use of electronic medical records can be targeted to specific patient groups, allowing more people to benefit from their usefulness. |
format | Online Article Text |
id | pubmed-10337515 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10337515 2023-07-13 JMIR Med Inform Original Paper JMIR Publications 2023-06-19 /pmc/articles/PMC10337515/ /pubmed/37335616 http://dx.doi.org/10.2196/41576 Text en ©Yang Chen, Xuejiao Liu, Lei Gao, Miao Zhu, Ben-Chang Shia, Mingchih Chen, Linglong Ye, Lei Qin. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 19.06.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included. |
title | Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337515/ https://www.ncbi.nlm.nih.gov/pubmed/37335616 http://dx.doi.org/10.2196/41576 |
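
The abstract in this record describes three generic steps: a chi-square screen for categorical variables, an 80%/20% train/test split, and 5-fold cross-validation on the training portion. The study itself used the R interface to H2O; the sketch below is not that code. It is a minimal, standard-library Python illustration of the same three steps, and every function name, the random seed, and the interleaved fold assignment are assumptions made for illustration only.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and split them into (train, test) lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(train_indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    # Interleaved assignment: row i of the training list goes to fold i % k.
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom
```

For example, `train_test_split_indices(100)` returns 80 training indices and 20 test indices, and `k_fold_indices` then yields five fit/validate pairs drawn only from the training indices, so the test rows are never seen during hyperparameter selection.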