Cargando…

Enhancing the Quality of Hierarchic Relations in the National Cancer Institute Thesaurus to Enable Faceted Query of Cancer Registry Data

PURPOSE: To audit and improve the completeness of the hierarchic (or is-a) relations of the National Cancer Institute (NCI) Thesaurus to support its role as a faceted system for querying cancer registry data. METHODS: We performed quality auditing of the 19.01d version of the NCI Thesaurus. Our hybr...

Descripción completa

Detalles Bibliográficos
Autores principales: Cui, Licong, Abeysinghe, Rashmie, Zheng, Fengbo, Tao, Shiqiang, Zeng, Ningzhou, Hands, Isaac, Durbin, Eric B., Whiteman, Lori, Remennik, Lyubov, Sioutos, Nicholas, Zhang, Guo-Qiang
Formato: Online Artículo Texto
Lenguaje:English
Publicado: American Society of Clinical Oncology 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265791/
https://www.ncbi.nlm.nih.gov/pubmed/32374632
http://dx.doi.org/10.1200/CCI.19.00124
Descripción
Sumario:PURPOSE: To audit and improve the completeness of the hierarchic (or is-a) relations of the National Cancer Institute (NCI) Thesaurus to support its role as a faceted system for querying cancer registry data. METHODS: We performed quality auditing of the 19.01d version of the NCI Thesaurus. Our hybrid auditing method consisted of three main steps: computing nonlattice subgraphs, constructing lexical features for concepts in each subgraph, and performing subsumption reasoning with each subgraph to automatically suggest potentially missing is-a relations. RESULTS: A total of 9,512 nonlattice subgraphs were obtained. Our method identified 925 potentially missing is-a relations in 441 nonlattice subgraphs; 72 of 176 reviewed samples were confirmed as valid missing is-a relations and have been incorporated in the newer versions of the NCI Thesaurus. CONCLUSION: Autosuggested changes resulting from our auditing method can improve the structural organization of the NCI Thesaurus in supporting its new role for faceted query.