
Enhancing the Quality of Hierarchic Relations in the National Cancer Institute Thesaurus to Enable Faceted Query of Cancer Registry Data

PURPOSE: To audit and improve the completeness of the hierarchic (or is-a) relations of the National Cancer Institute (NCI) Thesaurus to support its role as a faceted system for querying cancer registry data. METHODS: We performed quality auditing of the 19.01d version of the NCI Thesaurus. Our hybr...


Bibliographic Details
Main Authors: Cui, Licong; Abeysinghe, Rashmie; Zheng, Fengbo; Tao, Shiqiang; Zeng, Ningzhou; Hands, Isaac; Durbin, Eric B.; Whiteman, Lori; Remennik, Lyubov; Sioutos, Nicholas; Zhang, Guo-Qiang
Format: Online Article Text
Language: English
Published: American Society of Clinical Oncology 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265791/
https://www.ncbi.nlm.nih.gov/pubmed/32374632
http://dx.doi.org/10.1200/CCI.19.00124
author Cui, Licong
Abeysinghe, Rashmie
Zheng, Fengbo
Tao, Shiqiang
Zeng, Ningzhou
Hands, Isaac
Durbin, Eric B.
Whiteman, Lori
Remennik, Lyubov
Sioutos, Nicholas
Zhang, Guo-Qiang
collection PubMed
description PURPOSE: To audit and improve the completeness of the hierarchic (or is-a) relations of the National Cancer Institute (NCI) Thesaurus to support its role as a faceted system for querying cancer registry data. METHODS: We performed quality auditing of the 19.01d version of the NCI Thesaurus. Our hybrid auditing method consisted of three main steps: computing nonlattice subgraphs, constructing lexical features for concepts in each subgraph, and performing subsumption reasoning with each subgraph to automatically suggest potentially missing is-a relations. RESULTS: A total of 9,512 nonlattice subgraphs were obtained. Our method identified 925 potentially missing is-a relations in 441 nonlattice subgraphs; 72 of 176 reviewed samples were confirmed as valid missing is-a relations and have been incorporated in the newer versions of the NCI Thesaurus. CONCLUSION: Autosuggested changes resulting from our auditing method can improve the structural organization of the NCI Thesaurus in supporting its new role for faceted query.
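The three steps named in the abstract (nonlattice subgraphs, lexical features, subsumption reasoning) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy hierarchy and concept names are invented, a nonlattice pair is taken to mean two unrelated concepts with more than one maximal common descendant, and the lexical feature is reduced to the bag of words in a concept name.

```python
from itertools import combinations, permutations

# Hypothetical toy is-a hierarchy (child -> set of direct parents).
# The concept names are invented for illustration, not taken from the NCI Thesaurus.
PARENTS = {
    "neoplasm": set(),
    "lung neoplasm": {"neoplasm"},
    "malignant neoplasm": {"neoplasm"},
    "malignant lung neoplasm": {"lung neoplasm", "malignant neoplasm"},
    "recurrent malignant lung neoplasm": {"lung neoplasm", "malignant neoplasm"},
}


def ancestors(node):
    """All concepts reachable from node by following is-a links upward."""
    seen, stack = set(), list(PARENTS[node])
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(PARENTS[cur])
    return seen


def descendants(node):
    """All concepts that have node among their ancestors."""
    return {n for n in PARENTS if node in ancestors(n)}


def maximal_common_descendants(a, b):
    """Common descendants of a and b that have no ancestor that is also common."""
    common = descendants(a) & descendants(b)
    return {c for c in common if not (ancestors(c) & common)}


def nonlattice_pairs():
    """Step 1: unrelated pairs with more than one maximal common descendant."""
    pairs = []
    for a, b in combinations(sorted(PARENTS), 2):
        if a in ancestors(b) or b in ancestors(a):
            continue
        mcd = maximal_common_descendants(a, b)
        if len(mcd) > 1:
            pairs.append((a, b, mcd))
    return pairs


def subgraph_nodes(a, b, mcd):
    """The pair, its maximal common descendants, and every concept in between."""
    below = descendants(a) | descendants(b)
    between = {n for n in below if any(n in ancestors(m) for m in mcd)}
    return {a, b} | mcd | between


def suggest_missing_isa(nodes):
    """Steps 2 and 3: bag-of-words features, then subset-based subsumption.

    If every word of `parent` also occurs in `child` (and `child` has more),
    and `parent` is not already an ancestor of `child`, suggest the relation.
    """
    words = {n: set(n.split()) for n in nodes}
    return {
        (child, parent)
        for child, parent in permutations(nodes, 2)
        if words[parent] < words[child] and parent not in ancestors(child)
    }


if __name__ == "__main__":
    for a, b, mcd in nonlattice_pairs():
        for child, parent in sorted(suggest_missing_isa(subgraph_nodes(a, b, mcd))):
            print(f"{child} is-a {parent}")
```

On this toy data the sketch finds one nonlattice pair and suggests a single candidate relation. As in the abstract, such suggestions are only candidates: there, 72 of 176 reviewed samples were confirmed as valid, so expert review remains part of the workflow.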
format Online
Article
Text
id pubmed-7265791
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society of Clinical Oncology
record_format MEDLINE/PubMed
spelling pubmed-7265791
2021-05-06
Enhancing the Quality of Hierarchic Relations in the National Cancer Institute Thesaurus to Enable Faceted Query of Cancer Registry Data
Cui, Licong; Abeysinghe, Rashmie; Zheng, Fengbo; Tao, Shiqiang; Zeng, Ningzhou; Hands, Isaac; Durbin, Eric B.; Whiteman, Lori; Remennik, Lyubov; Sioutos, Nicholas; Zhang, Guo-Qiang
JCO Clin Cancer Inform
ORIGINAL REPORTS
American Society of Clinical Oncology
2020-05-06
/pmc/articles/PMC7265791/
/pubmed/32374632
http://dx.doi.org/10.1200/CCI.19.00124
Text
en
© 2020 by American Society of Clinical Oncology
Licensed under the Creative Commons Attribution 4.0 License: https://creativecommons.org/licenses/by/4.0/
title Enhancing the Quality of Hierarchic Relations in the National Cancer Institute Thesaurus to Enable Faceted Query of Cancer Registry Data
topic ORIGINAL REPORTS
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265791/
https://www.ncbi.nlm.nih.gov/pubmed/32374632
http://dx.doi.org/10.1200/CCI.19.00124