Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach
BACKGROUND: Web-based physician reviews are invaluable gold mines that merit further investigation. Although many studies have explored the text information of physician reviews, very few have focused on developing a systematic topic taxonomy embedded in physician reviews. The first step toward mini...
Main authors: | Li, Jia; Liu, Minghui; Li, Xiaojun; Liu, Xuan; Liu, Jingfang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6117498/ https://www.ncbi.nlm.nih.gov/pubmed/30115610 http://dx.doi.org/10.2196/jmir.8868 |
_version_ | 1783351772102787072 |
---|---|
author | Li, Jia; Liu, Minghui; Li, Xiaojun; Liu, Xuan; Liu, Jingfang |
author_facet | Li, Jia; Liu, Minghui; Li, Xiaojun; Liu, Xuan; Liu, Jingfang |
author_sort | Li, Jia |
collection | PubMed |
description | BACKGROUND: Web-based physician reviews are invaluable gold mines that merit further investigation. Although many studies have explored the text information of physician reviews, very few have focused on developing a systematic topic taxonomy embedded in physician reviews. The first step toward mining physician reviews is to determine how the natural structure or dimensions are embedded in reviews. Therefore, it is relevant to develop the topic taxonomy rigorously and systematically. OBJECTIVE: This study aims to develop a hierarchical topic taxonomy to uncover the latent structure of physician reviews and illustrate its application for mining patients’ interests based on the proposed taxonomy and algorithm. METHODS: Data comprised 122,716 physician reviews, including reviews of 8501 doctors from a leading physician review website in China (haodf.com), collected between 2007 and 2015. Mixed methods, including a literature review, data-driven topic discovery, and human annotation, were used to develop the physician review topic taxonomy. RESULTS: The identified taxonomy included 3 domains or high-level categories and 9 subtopics or low-level categories. The physician-related domain included the categories of medical ethics, medical competence, communication skills, and medical advice and prescriptions. The patient-related domain included the categories of the patient profile, symptoms, and diagnosis and pathogenesis. The system-related domain included the categories of financing and operation process. The F-measure of the proposed classification algorithm reached 0.816 on average. 
Symptoms (Cohen d=1.58, Δu=0.216, t=229.75, and P<.001) are more often mentioned by patients with acute diseases, whereas communication skills (Cohen d=−0.29, Δu=−0.038, t=−42.01, and P<.001), financing (Cohen d=−0.68, Δu=−0.098, t=−99.26, and P<.001), and diagnosis and pathogenesis (Cohen d=−0.55, Δu=−0.078, t=−80.09, and P<.001) are more often mentioned by patients with chronic diseases. Patients with mild diseases were more interested in medical ethics (Cohen d=0.25, Δu=0.039, t=8.33, and P<.001), operation process (Cohen d=0.57, Δu=0.060, t=18.75, and P<.001), patient profile (Cohen d=1.19, Δu=0.132, t=39.33, and P<.001), and symptoms (Cohen d=1.91, Δu=0.274, t=62.82, and P<.001). Meanwhile, patients with serious diseases were more interested in medical competence (Cohen d=−0.99, Δu=−0.165, t=−32.58, and P<.001), medical advice and prescription (Cohen d=−0.65, Δu=−0.082, t=−21.45, and P<.001), financing (Cohen d=−0.26, Δu=−0.018, t=−8.45, and P<.001), and diagnosis and pathogenesis (Cohen d=−1.55, Δu=−0.229, t=−50.93, and P<.001). CONCLUSIONS: This mixed-methods approach, integrating literature reviews, data-driven topic discovery, and human annotation, is an effective and rigorous way to develop a physician review topic taxonomy. The proposed algorithm based on Labeled-Latent Dirichlet Allocation can achieve impressive classification results for mining patients’ interests. Furthermore, the mining results reveal marked differences in patients’ interests across different disease types, socioeconomic development levels, and hospital levels. |
format | Online Article Text |
id | pubmed-6117498 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-6117498 2018-09-06 Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach Li, Jia Liu, Minghui Li, Xiaojun Liu, Xuan Liu, Jingfang J Med Internet Res Original Paper JMIR Publications 2018-08-16 /pmc/articles/PMC6117498/ /pubmed/30115610 http://dx.doi.org/10.2196/jmir.8868 Text en ©Jia Li, Minghui Liu, Xiaojun Li, Xuan Liu, Jingfang Liu. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.08.2018.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Li, Jia Liu, Minghui Li, Xiaojun Liu, Xuan Liu, Jingfang Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach |
title | Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach |
title_sort | developing embedded taxonomy and mining patients’ interests from web-based physician reviews: mixed-methods approach |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6117498/ https://www.ncbi.nlm.nih.gov/pubmed/30115610 http://dx.doi.org/10.2196/jmir.8868 |