Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study

BACKGROUND: With the advent of electronic storage of medical records and the internet, patients can access web-based medical records. This has facilitated doctor-patient communication and built trust between them. However, many patients avoid using web-based medical records despite their greater ava...

Bibliographic Details
Main Authors: Chen, Yang; Liu, Xuejiao; Gao, Lei; Zhu, Miao; Shia, Ben-Chang; Chen, Mingchih; Ye, Linglong; Qin, Lei
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337515/
https://www.ncbi.nlm.nih.gov/pubmed/37335616
http://dx.doi.org/10.2196/41576
author Chen, Yang
Liu, Xuejiao
Gao, Lei
Zhu, Miao
Shia, Ben-Chang
Chen, Mingchih
Ye, Linglong
Qin, Lei
collection PubMed
description BACKGROUND: With the advent of electronic storage of medical records and the internet, patients can access web-based medical records. This has facilitated doctor-patient communication and built trust between them. However, many patients avoid using web-based medical records despite their greater availability and readability.
OBJECTIVE: On the basis of demographic and individual behavioral characteristics, this study explores the predictors of web-based medical record nonuse among patients.
METHODS: Data were collected from the National Cancer Institute 2019 to 2020 Health Information National Trends Survey. First, based on the data-rich environment, the chi-square test (categorical variables) and 2-tailed t tests (continuous variables) were performed on the response variables and the variables in the questionnaire. According to the test results, the variables were initially screened, and those that passed the test were selected for subsequent analysis. Second, participants were excluded from the study if any of the initially screened variables were missing. Third, the data obtained were modeled using 5 machine learning algorithms, namely, logistic regression, automatic generalized linear model, automatic random forest, automatic deep neural network, and automatic gradient boosting machine, to identify and investigate factors affecting web-based medical record nonuse. The aforementioned automatic machine learning algorithms were based on the R interface (R Foundation for Statistical Computing) of the H2O (H2O.ai) scalable machine learning platform. Finally, 5-fold cross-validation was adopted for 80% of the data set, which was used as the training data to determine hyperparameters of 5 algorithms, and 20% of the data set was used as the test data for model comparison.
RESULTS: Among the 9072 respondents, 5409 (59.62%) had no experience using web-based medical records. Using the 5 algorithms, 29 variables were identified as crucial predictors of nonuse of web-based medical records. These 29 variables comprised 6 (21%) sociodemographic variables (age, BMI, race, marital status, education, and income) and 23 (79%) variables related to individual lifestyles and behavioral habits (such as electronic and internet use, individuals’ health status and their level of health concern, etc). H2O’s automatic machine learning methods have a high model accuracy. On the basis of the performance of the validation data set, the optimal model was the automatic random forest with the highest area under the curve in the validation set (88.52%) and the test set (82.87%).
CONCLUSIONS: When monitoring web-based medical record use trends, research should focus on social factors such as age, education, BMI, and marital status, as well as personal lifestyle and behavioral habits, including smoking, use of electronic devices and the internet, patients’ personal health status, and their level of health concern. The use of electronic medical records can be targeted to specific patient groups, allowing more people to benefit from their usefulness.
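The evaluation protocol named in the abstract (an 80% training portion tuned with 5-fold cross-validation, a 20% held-out test portion, and comparison by area under the curve) can be sketched in plain Python. This is an illustration only, not the authors' code: the study used the R interface of H2O, whereas the function names `holdout_and_folds` and `auc`, the seed, and the row count of 1000 below are all hypothetical choices made for the example.

```python
import random

def holdout_and_folds(n_rows, test_frac=0.2, k=5, seed=0):
    """Shuffle row indices, hold out a test set, and split the rest into k folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_test = round(n_rows * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    # Fold i takes every k-th training index, so fold sizes differ by at most 1.
    folds = [train[i::k] for i in range(k)]
    return train, test, folds

def auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical row count; the abstract does not state the final sample size
# after participants with missing values were excluded.
train, test, folds = holdout_and_folds(1000)

# Toy labels (1 = nonuse) and predicted scores, purely for illustration.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In each cross-validation round, one fold would serve as validation data while the other four are used for fitting; the held-out test indices are touched only once, for the final model comparison.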
format Online
Article
Text
id pubmed-10337515
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10337515 2023-07-13 JMIR Med Inform Original Paper JMIR Publications 2023-06-19 /pmc/articles/PMC10337515/ /pubmed/37335616 http://dx.doi.org/10.2196/41576 Text en ©Yang Chen, Xuejiao Liu, Lei Gao, Miao Zhu, Ben-Chang Shia, Mingchih Chen, Linglong Ye, Lei Qin. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 19.06.2023.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study
topic Original Paper