Computer-Assisted Update of a Consumer Health Vocabulary Through Mining of Social Network Data

Bibliographic Details
Main Authors: Doing-Harris, Kristina M, Zeng-Treitler, Qing
Format: Online Article Text
Language: English
Published: Gunther Eysenbach 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221384/
https://www.ncbi.nlm.nih.gov/pubmed/21586386
http://dx.doi.org/10.2196/jmir.1636
Description
Summary: BACKGROUND: Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language. OBJECTIVE: Our objective was to create a computer-assisted update (CAU) system that works with live corpora to identify new candidate terms for inclusion in the open access and collaborative (OAC) CHV. METHODS: The CAU system consisted of three main parts: a Web crawler and an HTML parser, a candidate term filter that utilizes natural language processing tools including term recognition methods, and a human review interface. In evaluation, the CAU system was applied to the health-related social network website PatientsLikeMe.com. The system’s utility was assessed by comparing the candidate term list it generated to a list of valid terms hand-extracted from the text of the crawled webpages. RESULTS: The CAU system identified 88,994 unique 1- to 7-grams (“n-grams” are n consecutive words within a sentence) in 300 crawled PatientsLikeMe.com webpages. The manual review of the crawled webpages identified 651 valid terms not yet included in the OAC CHV or the Unified Medical Language System (UMLS) Metathesaurus, a collection of vocabularies amalgamated to form an ontology of medical terms; that is, 1 valid term per 136.7 candidate n-grams. The term filter selected 774 candidate terms, of which 237 were valid terms, that is, 1 valid term among every 3 or 4 candidates reviewed. CONCLUSION: The CAU system is effective for generating a list of candidate terms for human review during CHV development.
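
The n-gram definition and the review ratios quoted in the summary can be illustrated with a minimal Python sketch. It is not the authors' CAU implementation: the function names `sentence_ngrams` and `candidates_per_valid`, the naive punctuation-based sentence splitter, and the sample text are all assumptions made for illustration. Only the counts (88,994; 651; 774; 237) come from the summary.

```python
import re


def sentence_ngrams(text, max_n=7):
    """Collect unique 1- to max_n-grams that never cross a sentence boundary."""
    unique = set()
    # Assumption: sentences end at ., ! or ?; the real system's splitter
    # is not described in the summary.
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"[a-z0-9'-]+", sentence.lower())
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                unique.add(" ".join(words[i:i + n]))
    return unique


def candidates_per_valid(candidates, valid):
    """Average number of candidates a reviewer reads per valid term found."""
    return candidates / valid


# Hypothetical sample text, not taken from the crawled pages.
sample = "I take my meds daily. My meds help."
grams = sentence_ngrams(sample)

# Ratios recomputed from the counts reported in the summary.
unfiltered = candidates_per_valid(88994, 651)  # all n-grams vs. valid terms
filtered = candidates_per_valid(774, 237)      # filter output vs. valid terms
print(f"unfiltered: {unfiltered:.1f}, filtered: {filtered:.1f}")
```

In this sketch "daily my" is never produced because it would span two sentences, which matches the within-sentence definition given in the summary. The second ratio is the source of the "1 valid term among every 3 or 4 candidates" figure.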