Outlier concepts auditing methodology for a large family of biomedical ontologies

BACKGROUND: Summarization networks are compact summaries of ontologies. The “Big Picture” view offered by summarization networks makes it possible to identify sets of concepts that are more likely to have errors than control concepts. For ontologies that have outgoing lateral relationships, we have developed...

Bibliographic Details
Main Authors: Zheng, Ling; Min, Hua; Chen, Yan; Keloth, Vipina; Geller, James; Perl, Yehoshua; Hripcsak, George
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737254/
https://www.ncbi.nlm.nih.gov/pubmed/33319713
http://dx.doi.org/10.1186/s12911-020-01311-x
_version_ 1783622909371088896
author Zheng, Ling
Min, Hua
Chen, Yan
Keloth, Vipina
Geller, James
Perl, Yehoshua
Hripcsak, George
author_facet Zheng, Ling
Min, Hua
Chen, Yan
Keloth, Vipina
Geller, James
Perl, Yehoshua
Hripcsak, George
author_sort Zheng, Ling
collection PubMed
description BACKGROUND: Summarization networks are compact summaries of ontologies. The “Big Picture” view offered by summarization networks makes it possible to identify sets of concepts that are more likely to have errors than control concepts. For ontologies that have outgoing lateral relationships, we have developed the “partial-area taxonomy” summarization network. Prior research has identified one kind of outlier concept: concepts in small partial-areas within partial-area taxonomies. We have previously shown that the small partial-area technique works successfully for four ontologies (or their hierarchies). METHODS: To improve Quality Assurance (QA) scalability, a family-based QA framework was developed, in which one QA technique is potentially applicable to a whole family of ontologies with similar structural features. The 373 ontologies hosted at the NCBO BioPortal in 2015 were classified into a collection of families based on structural features. A meta-ontology represents this family collection, including one family of ontologies having outgoing lateral relationships. The process of updating the current meta-ontology is described. To conclude that one QA technique is applicable to at least half of the members of a family F, the technique should be demonstrated as successful for six out of six ontologies in F. We describe a hypothesis setting the condition required for a technique to be successful for a given ontology. The process of a study to demonstrate such success is described. This paper aims to demonstrate the scalability of the small partial-area technique. RESULTS: We first updated the meta-ontology to classify 566 BioPortal ontologies. There were 371 ontologies in the family with outgoing lateral relationships. We demonstrated the success of the small partial-area technique for two ontology hierarchies that belong to this family: SNOMED CT’s Specimen hierarchy and NCIt’s Gene hierarchy. Together with the four previous ontologies from the same family, we fulfilled the “six out of six” condition required to show the scalability for the whole family. CONCLUSIONS: We have shown that the small partial-area technique is potentially successful for the family of ontologies with outgoing lateral relationships in BioPortal, thus improving the scalability of this QA technique.
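
The abstract does not spell out the statistical reasoning behind the “six out of six” condition. One plausible reading, offered here as an assumption rather than the authors' stated derivation, is a one-sided binomial argument: if the technique succeeded for at most half of a family, six successes in six independently sampled ontologies would be unlikely. The sketch below computes that probability. The function name, the boundary success rate of 0.5, and the 0.05 threshold are illustrative choices, not taken from the record.

```python
from math import comb


def prob_at_least_k_successes(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n independent trials,
    each succeeding with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed boundary case: the technique works for exactly half of the family.
BOUNDARY_RATE = 0.5   # hypothetical null-hypothesis success rate
ALPHA = 0.05          # hypothetical significance threshold

tail = prob_at_least_k_successes(n=6, k=6, p=BOUNDARY_RATE)
print(f"P(6 of 6 successes | rate = 0.5) = {tail}")
print("Below threshold" if tail < ALPHA else "Not below threshold")
```

Under these assumptions the probability is 0.5^6 = 0.015625, which falls below the assumed 0.05 threshold. The argument depends on the six ontologies being sampled independently from the family.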
format Online
Article
Text
id pubmed-7737254
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7737254 2020-12-17 Outlier concepts auditing methodology for a large family of biomedical ontologies Zheng, Ling Min, Hua Chen, Yan Keloth, Vipina Geller, James Perl, Yehoshua Hripcsak, George BMC Med Inform Decis Mak Research BACKGROUND: Summarization networks are compact summaries of ontologies. The “Big Picture” view offered by summarization networks makes it possible to identify sets of concepts that are more likely to have errors than control concepts. For ontologies that have outgoing lateral relationships, we have developed the “partial-area taxonomy” summarization network. Prior research has identified one kind of outlier concept: concepts in small partial-areas within partial-area taxonomies. We have previously shown that the small partial-area technique works successfully for four ontologies (or their hierarchies). METHODS: To improve Quality Assurance (QA) scalability, a family-based QA framework was developed, in which one QA technique is potentially applicable to a whole family of ontologies with similar structural features. The 373 ontologies hosted at the NCBO BioPortal in 2015 were classified into a collection of families based on structural features. A meta-ontology represents this family collection, including one family of ontologies having outgoing lateral relationships. The process of updating the current meta-ontology is described. To conclude that one QA technique is applicable to at least half of the members of a family F, the technique should be demonstrated as successful for six out of six ontologies in F. We describe a hypothesis setting the condition required for a technique to be successful for a given ontology. The process of a study to demonstrate such success is described. This paper aims to demonstrate the scalability of the small partial-area technique. RESULTS: We first updated the meta-ontology to classify 566 BioPortal ontologies. There were 371 ontologies in the family with outgoing lateral relationships.
We demonstrated the success of the small partial-area technique for two ontology hierarchies that belong to this family: SNOMED CT’s Specimen hierarchy and NCIt’s Gene hierarchy. Together with the four previous ontologies from the same family, we fulfilled the “six out of six” condition required to show the scalability for the whole family. CONCLUSIONS: We have shown that the small partial-area technique is potentially successful for the family of ontologies with outgoing lateral relationships in BioPortal, thus improving the scalability of this QA technique. BioMed Central 2020-12-15 /pmc/articles/PMC7737254/ /pubmed/33319713 http://dx.doi.org/10.1186/s12911-020-01311-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Zheng, Ling
Min, Hua
Chen, Yan
Keloth, Vipina
Geller, James
Perl, Yehoshua
Hripcsak, George
Outlier concepts auditing methodology for a large family of biomedical ontologies
title Outlier concepts auditing methodology for a large family of biomedical ontologies
title_full Outlier concepts auditing methodology for a large family of biomedical ontologies
title_fullStr Outlier concepts auditing methodology for a large family of biomedical ontologies
title_full_unstemmed Outlier concepts auditing methodology for a large family of biomedical ontologies
title_short Outlier concepts auditing methodology for a large family of biomedical ontologies
title_sort outlier concepts auditing methodology for a large family of biomedical ontologies
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737254/
https://www.ncbi.nlm.nih.gov/pubmed/33319713
http://dx.doi.org/10.1186/s12911-020-01311-x
work_keys_str_mv AT zhengling outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies
AT minhua outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies
AT chenyan outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies
AT kelothvipina outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies
AT gellerjames outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies
AT perlyehoshua outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies
AT hripcsakgeorge outlierconceptsauditingmethodologyforalargefamilyofbiomedicalontologies