Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data

Bibliographic Details
Main Authors: Doing-Harris, Kristina M; Zeng-Treitler, Qing
Format: Online Article Text
Language: English
Published: Gunther Eysenbach 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221384/
https://www.ncbi.nlm.nih.gov/pubmed/21586386
http://dx.doi.org/10.2196/jmir.1636
author Doing-Harris, Kristina M
Zeng-Treitler, Qing
collection PubMed
description BACKGROUND: Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language. OBJECTIVE: Our objective was to create a computer-assisted update (CAU) system that works with live corpora to identify new candidate terms for inclusion in the open access and collaborative (OAC) CHV. METHODS: The CAU system consisted of three main parts: a Web crawler and HTML parser; a candidate term filter that uses natural language processing tools, including term recognition methods; and a human review interface. In evaluation, the CAU system was applied to the health-related social network website PatientsLikeMe.com. The system’s utility was assessed by comparing the candidate term list it generated with a list of valid terms extracted by hand from the text of the crawled webpages. RESULTS: The CAU system identified 88,994 unique 1- to 7-grams (an “n-gram” is a sequence of n consecutive words within a sentence) in 300 crawled PatientsLikeMe.com webpages. Manual review of the crawled webpages identified 651 valid terms not yet included in the OAC CHV or the Unified Medical Language System (UMLS) Metathesaurus, a collection of vocabularies amalgamated to form an ontology of medical terms. That is 1 valid term per 136.7 candidate n-grams. The term filter selected 774 candidate terms, of which 237 were valid, that is, about 1 valid term among every 3 or 4 candidates reviewed. CONCLUSION: The CAU system is effective for generating a list of candidate terms for human review during CHV development.
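The n-gram definition and the yield ratios in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CAU implementation: the function names `sentence_ngrams` and `candidates_per_valid_term`, the naive punctuation-based sentence split, the lowercasing, and the sample text are all assumptions made for illustration. Only the counts 88,994, 651, 774, and 237 come from the abstract.

```python
import re
from typing import Iterator


def sentence_ngrams(text: str, max_n: int = 7) -> Iterator[tuple[str, ...]]:
    """Yield every 1- to max_n-gram of consecutive words inside one sentence."""
    # Assumption: sentences end at '.', '!' or '?'. The tokenizer actually
    # used by the CAU system is not described in this record.
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"[A-Za-z0-9'-]+", sentence.lower())
        for n in range(1, max_n + 1):
            # Slide a window of n words; it never crosses a sentence boundary.
            for start in range(len(words) - n + 1):
                yield tuple(words[start:start + n])


def candidates_per_valid_term(candidates: int, valid: int) -> float:
    """Number of candidates a reviewer reads, on average, per valid term."""
    return candidates / valid


# Hypothetical sample text, not taken from the crawled pages.
sample = "My fatigue got worse. Brain fog is common."
unique_ngrams = set(sentence_ngrams(sample))

# Counts reported in the abstract.
unfiltered = candidates_per_valid_term(88994, 651)  # all unique n-grams
filtered = candidates_per_valid_term(774, 237)      # after the term filter
print(len(unique_ngrams), round(unfiltered, 1), round(filtered, 1))
```

Because the window stays inside a sentence, a pair such as ("worse", "brain") is never produced. The two ratios show what the filter buys the reviewer: roughly 137 candidates per valid term without it, against about 3 with it.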
format Online
Article
Text
id pubmed-3221384
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Gunther Eysenbach
record_format MEDLINE/PubMed
spelling pubmed-3221384 2011-11-21 Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data Doing-Harris, Kristina M Zeng-Treitler, Qing J Med Internet Res Original Paper Gunther Eysenbach 2011-05-17 /pmc/articles/PMC3221384/ /pubmed/21586386 http://dx.doi.org/10.2196/jmir.1636 Text en ©Kristina M Doing-Harris, Qing Zeng-Treitler.
Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 17.05.2011. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data
topic Original Paper