Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data
BACKGROUND: Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language. OBJECTIVE: Our objective was to create a computer assisted update (CAU) system that works with live cor...
Main Authors: | Doing-Harris, Kristina M, Zeng-Treitler, Qing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Gunther Eysenbach 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221384/ https://www.ncbi.nlm.nih.gov/pubmed/21586386 http://dx.doi.org/10.2196/jmir.1636 |
_version_ | 1782217086051811328 |
---|---|
author | Doing-Harris, Kristina M Zeng-Treitler, Qing |
author_facet | Doing-Harris, Kristina M Zeng-Treitler, Qing |
author_sort | Doing-Harris, Kristina M |
collection | PubMed |
description | BACKGROUND: Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language. OBJECTIVE: Our objective was to create a computer-assisted update (CAU) system that works with live corpora to identify new candidate terms for inclusion in the open access and collaborative (OAC) CHV. METHODS: The CAU system consisted of three main parts: a Web crawler and an HTML parser, a candidate term filter that uses natural language processing tools including term recognition methods, and a human review interface. In evaluation, the CAU system was applied to the health-related social network website PatientsLikeMe.com. The system’s utility was assessed by comparing the candidate term list it generated with a list of valid terms hand-extracted from the text of the crawled webpages. RESULTS: The CAU system identified 88,994 unique 1- to 7-grams (“n-grams” are n consecutive words within a sentence) in 300 crawled PatientsLikeMe.com webpages. The manual review of the crawled webpages identified 651 valid terms not yet included in the OAC CHV or the Unified Medical Language System (UMLS) Metathesaurus, a collection of vocabularies amalgamated to form an ontology of medical terms (ie, 1 valid term per 136.7 candidate n-grams). The term filter selected 774 candidate terms, of which 237 were valid, that is, 1 valid term among every 3 or 4 candidates reviewed. CONCLUSION: The CAU system is effective for generating a list of candidate terms for human review during CHV development. |
format | Online Article Text |
id | pubmed-3221384 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Gunther Eysenbach |
record_format | MEDLINE/PubMed |
spelling | pubmed-3221384 2011-11-21 Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data Doing-Harris, Kristina M Zeng-Treitler, Qing J Med Internet Res Original Paper Gunther Eysenbach 2011-05-17 /pmc/articles/PMC3221384/ /pubmed/21586386 http://dx.doi.org/10.2196/jmir.1636 Text en ©Kristina M Doing-Harris, Qing Zeng-Treitler. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 17.05.2011. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Doing-Harris, Kristina M Zeng-Treitler, Qing Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data |
title | Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221384/ https://www.ncbi.nlm.nih.gov/pubmed/21586386 http://dx.doi.org/10.2196/jmir.1636 |
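
The abstract defines n-grams as n consecutive words within a sentence and reports review yields as ratios (88,994 n-grams for 651 valid terms; 774 filtered candidates for 237 valid terms). The following is a minimal Python sketch of that counting and arithmetic, not the authors' implementation: the sentence splitter, the word pattern, and the function names `sentence_ngrams` and `candidates_per_valid` are illustrative assumptions, since this record does not describe the system's tokenizer or filter.

```python
import re


def sentence_ngrams(text, max_n=7):
    """Collect unique 1- to max_n-grams that never cross a sentence boundary."""
    ngrams = set()
    # Assumption: sentences end at '.', '!' or '?'. The real system's
    # sentence splitter is not specified in this record.
    for sentence in re.split(r"[.!?]+", text):
        # Assumption: a word is a run of letters, digits, apostrophes or hyphens.
        words = re.findall(r"[A-Za-z0-9'-]+", sentence.lower())
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                ngrams.add(" ".join(words[i:i + n]))
    return ngrams


def candidates_per_valid(candidates, valid):
    """How many candidates a reviewer reads, on average, per valid term."""
    return candidates / valid


# Toy input (invented for illustration, not taken from the crawled pages).
grams = sentence_ngrams("My brain fog is back. Brain fog again!", max_n=2)

# Ratios recomputed from the counts reported in the abstract.
unfiltered = candidates_per_valid(88994, 651)  # all n-grams vs. valid terms
filtered = candidates_per_valid(774, 237)      # filtered candidates vs. valid terms
```

Under these assumptions, "back brain" is never produced because it would span two sentences, and the two ratios correspond to the abstract's "1 valid term per 136.7 candidate n-grams" and "1 valid term among every 3 or 4 candidates".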