GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products

BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. In spite of the undoubted importance of GO, several drawbacks associated with GO and GO-based annotations have been introduced. We identified three types of semantic inconsistencies in GO-base...

Bibliographic Details
Main Authors: Park, Yu Rang; Kim, Jihun; Lee, Hye Won; Yoon, Young Jo; Kim, Ju Han
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044297/
https://www.ncbi.nlm.nih.gov/pubmed/21342572
http://dx.doi.org/10.1186/1471-2105-12-S1-S40
author Park, Yu Rang
Kim, Jihun
Lee, Hye Won
Yoon, Young Jo
Kim, Ju Han
collection PubMed
description BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. In spite of the undoubted importance of GO, several drawbacks associated with GO and GO-based annotations have been introduced. We identified three types of semantic inconsistencies in GO-based annotations: semantically redundant, biological-domain inconsistent, and taxonomy inconsistent annotations. METHODS: To determine the semantic inconsistencies in GO annotations, we used the hierarchical structure of the GO graph and the tree structure of the NCBI taxonomy. Twenty-seven biological databases were collected to find semantically inconsistent annotations. RESULTS: The distributions and possible causes of the semantic inconsistencies were investigated using twenty-seven biological databases with GO-based annotations. We found that some annotation evidence codes were associated with the inconsistencies. The numbers of gene products and species in a database, which are related to the complexity of database management, are also correlated with the inconsistencies. Consequently, numerous annotation errors arise and are propagated throughout biological databases and GO-based high-level analyses. GOChase-II was developed to detect and correct both syntactic and semantic errors in GO-based annotations. CONCLUSIONS: We identified some inconsistencies in GO-based annotations and provided software, GOChase-II, for correcting these semantic inconsistencies, in addition to the earlier corrections of syntactic errors by GOChase-I.
format Text
id pubmed-3044297
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3044297 2011-02-25 GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products. Park, Yu Rang; Kim, Jihun; Lee, Hye Won; Yoon, Young Jo; Kim, Ju Han. BMC Bioinformatics. Research. BioMed Central 2011-02-15. /pmc/articles/PMC3044297/ /pubmed/21342572 http://dx.doi.org/10.1186/1471-2105-12-S1-S40 Text en. Copyright ©2011 Park et al; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044297/
https://www.ncbi.nlm.nih.gov/pubmed/21342572
http://dx.doi.org/10.1186/1471-2105-12-S1-S40