Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach

BACKGROUND: Web-based physician reviews are invaluable gold mines that merit further investigation. Although many studies have explored the text information of physician reviews, very few have focused on developing a systematic topic taxonomy embedded in physician reviews. The first step toward mini...

Bibliographic Details
Main Authors: Li, Jia; Liu, Minghui; Li, Xiaojun; Liu, Xuan; Liu, Jingfang
Format: Online Article Text
Language: English
Published: JMIR Publications 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6117498/
https://www.ncbi.nlm.nih.gov/pubmed/30115610
http://dx.doi.org/10.2196/jmir.8868
_version_ 1783351772102787072
author Li, Jia
Liu, Minghui
Li, Xiaojun
Liu, Xuan
Liu, Jingfang
author_facet Li, Jia
Liu, Minghui
Li, Xiaojun
Liu, Xuan
Liu, Jingfang
author_sort Li, Jia
collection PubMed
description BACKGROUND: Web-based physician reviews are invaluable gold mines that merit further investigation. Although many studies have explored the text information of physician reviews, very few have focused on developing a systematic topic taxonomy embedded in physician reviews. The first step toward mining physician reviews is to determine the natural structure or dimensions embedded in reviews. Therefore, it is relevant to develop the topic taxonomy rigorously and systematically. OBJECTIVE: This study aims to develop a hierarchical topic taxonomy to uncover the latent structure of physician reviews and illustrate its application for mining patients’ interests based on the proposed taxonomy and algorithm. METHODS: Data comprised 122,716 physician reviews, including reviews of 8501 doctors from a leading physician review website in China (haodf.com), collected between 2007 and 2015. Mixed methods, including a literature review, data-driven topic discovery, and human annotation, were used to develop the physician review topic taxonomy. RESULTS: The identified taxonomy included 3 domains or high-level categories and 9 subtopics or low-level categories. The physician-related domain included the categories of medical ethics, medical competence, communication skills, and medical advice and prescriptions. The patient-related domain included the categories of patient profile, symptoms, and diagnosis and pathogenesis. The system-related domain included the categories of financing and operation process. The F-measure of the proposed classification algorithm reached 0.816 on average. 
Symptoms (Cohen d=1.58, Δu=0.216, t=229.75, and P<.001) were more often mentioned by patients with acute diseases, whereas communication skills (Cohen d=−0.29, Δu=−0.038, t=−42.01, and P<.001), financing (Cohen d=−0.68, Δu=−0.098, t=−99.26, and P<.001), and diagnosis and pathogenesis (Cohen d=−0.55, Δu=−0.078, t=−80.09, and P<.001) were more often mentioned by patients with chronic diseases. Patients with mild diseases were more interested in medical ethics (Cohen d=0.25, Δu=0.039, t=8.33, and P<.001), operation process (Cohen d=0.57, Δu=0.060, t=18.75, and P<.001), patient profile (Cohen d=1.19, Δu=0.132, t=39.33, and P<.001), and symptoms (Cohen d=1.91, Δu=0.274, t=62.82, and P<.001). Meanwhile, patients with serious diseases were more interested in medical competence (Cohen d=−0.99, Δu=−0.165, t=−32.58, and P<.001), medical advice and prescription (Cohen d=−0.65, Δu=−0.082, t=−21.45, and P<.001), financing (Cohen d=−0.26, Δu=−0.018, t=−8.45, and P<.001), and diagnosis and pathogenesis (Cohen d=−1.55, Δu=−0.229, t=−50.93, and P<.001). CONCLUSIONS: This mixed-methods approach, integrating literature reviews, data-driven topic discovery, and human annotation, is an effective and rigorous way to develop a physician review topic taxonomy. The proposed algorithm based on Labeled-Latent Dirichlet Allocation can achieve impressive classification results for mining patients’ interests. Furthermore, the mining results reveal marked differences in patients’ interests across different disease types, socioeconomic development levels, and hospital levels.
format Online
Article
Text
id pubmed-6117498
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6117498 2018-09-06 Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach Li, Jia Liu, Minghui Li, Xiaojun Liu, Xuan Liu, Jingfang J Med Internet Res Original Paper JMIR Publications 2018-08-16 /pmc/articles/PMC6117498/ /pubmed/30115610 http://dx.doi.org/10.2196/jmir.8868 Text en ©Jia Li, Minghui Liu, Xiaojun Li, Xuan Liu, Jingfang Liu. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.08.2018.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Li, Jia
Liu, Minghui
Li, Xiaojun
Liu, Xuan
Liu, Jingfang
Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach
title Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach
title_sort developing embedded taxonomy and mining patients’ interests from web-based physician reviews: mixed-methods approach
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6117498/
https://www.ncbi.nlm.nih.gov/pubmed/30115610
http://dx.doi.org/10.2196/jmir.8868
work_keys_str_mv AT lijia developingembeddedtaxonomyandminingpatientsinterestsfromwebbasedphysicianreviewsmixedmethodsapproach
AT liuminghui developingembeddedtaxonomyandminingpatientsinterestsfromwebbasedphysicianreviewsmixedmethodsapproach
AT lixiaojun developingembeddedtaxonomyandminingpatientsinterestsfromwebbasedphysicianreviewsmixedmethodsapproach
AT liuxuan developingembeddedtaxonomyandminingpatientsinterestsfromwebbasedphysicianreviewsmixedmethodsapproach
AT liujingfang developingembeddedtaxonomyandminingpatientsinterestsfromwebbasedphysicianreviewsmixedmethodsapproach