GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products
BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. In spite of the undoubted importance of GO, several drawbacks associated with GO and GO-based annotations have been introduced. We identified three types of semantic inconsistencies in GO-base...
Main authors: | Park, Yu Rang; Kim, Jihun; Lee, Hye Won; Yoon, Young Jo; Kim, Ju Han |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2011 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044297/ https://www.ncbi.nlm.nih.gov/pubmed/21342572 http://dx.doi.org/10.1186/1471-2105-12-S1-S40 |
_version_ | 1782198714067058688 |
---|---|
author | Park, Yu Rang Kim, Jihun Lee, Hye Won Yoon, Young Jo Kim, Ju Han |
author_facet | Park, Yu Rang Kim, Jihun Lee, Hye Won Yoon, Young Jo Kim, Ju Han |
author_sort | Park, Yu Rang |
collection | PubMed |
description | BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. Despite the undoubted importance of GO, several drawbacks associated with GO and GO-based annotations have been reported. We identified three types of semantic inconsistencies in GO-based annotations: semantically redundant, biological-domain inconsistent, and taxonomy inconsistent annotations. METHODS: To detect the semantic inconsistencies in GO annotations, we used the hierarchical structure of the GO graph and the tree structure of the NCBI taxonomy. Twenty-seven biological databases were collected to search for semantically inconsistent annotations. RESULTS: The distributions and possible causes of the semantic inconsistencies were investigated using the twenty-seven biological databases with GO-based annotations. We found that some annotation evidence codes were associated with the inconsistencies. The numbers of gene products and species in a database, which are related to the complexity of database management, were also correlated with the inconsistencies. Consequently, numerous annotation errors arise and are propagated throughout biological databases and GO-based high-level analyses. GOChase-II was developed to detect and correct both syntactic and semantic errors in GO-based annotations. CONCLUSIONS: We identified some inconsistencies in GO-based annotations and provide software, GOChase-II, for correcting these semantic inconsistencies in addition to the syntactic-error corrections already provided by GOChase-I. |
format | Text |
id | pubmed-3044297 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-30442972011-02-25 GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products Park, Yu Rang Kim, Jihun Lee, Hye Won Yoon, Young Jo Kim, Ju Han BMC Bioinformatics Research BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. In spite of the undoubted importance of GO, several drawbacks associated with GO and GO-based annotations have been introduced. We identified three types of semantic inconsistencies in GO-based annotations; semantically redundant, biological-domain inconsistent and taxonomy inconsistent annotations. METHODS: To determine the semantic inconsistencies in GO annotation, we used the hierarchical structure of GO graph and tree structure of NCBI taxonomy. Twenty seven biological databases were collected for finding semantic inconsistent annotation. RESULTS: The distributions and possible causes of the semantic inconsistencies were investigated using twenty seven biological databases with GO-based annotations. We found that some evidence codes of annotation were associated with the inconsistencies. The numbers of gene products and species in a database that are related to the complexity of database management are also in correlation with the inconsistencies. Consequently, numerous annotation errors arise and are propagated throughout biological databases and GO-based high-level analyses. GOChase-II is developed to detect and correct both syntactic and semantic errors in GO-based annotations. CONCLUSIONS: We identified some inconsistencies in GO-based annotation and provided software, GOChase-II, for correcting these semantic inconsistencies in addition to the previous corrections for the syntactic errors by GOChase-I. BioMed Central 2011-02-15 /pmc/articles/PMC3044297/ /pubmed/21342572 http://dx.doi.org/10.1186/1471-2105-12-S1-S40 Text en Copyright ©2011 Park et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044297/ https://www.ncbi.nlm.nih.gov/pubmed/21342572 http://dx.doi.org/10.1186/1471-2105-12-S1-S40 |
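
The description says the hierarchical structure of the GO graph was used to find semantically redundant annotations. The following is a minimal sketch of one way such a check could work, assuming that "semantically redundant" means a gene product annotated with both a term and one of that term's ancestors. The term IDs, the `PARENTS` graph, and the function names are hypothetical illustrations, not GOChase-II's actual data or API.

```python
from typing import Dict, Iterable, Set

# Toy "is_a" graph: child term -> set of direct parent terms.
# These IDs are made up for illustration; they are not real GO identifiers.
PARENTS: Dict[str, Set[str]] = {
    "T:root": set(),
    "T:a": {"T:root"},
    "T:b": {"T:a"},
    "T:c": {"T:root"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every term reachable from `term` by following parent links."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def redundant_terms(annotated: Iterable[str],
                    parents: Dict[str, Set[str]]) -> Set[str]:
    """Annotated terms that are already implied by a more specific annotation."""
    annotated_set = set(annotated)
    implied: Set[str] = set()
    for term in annotated_set:
        implied |= ancestors(term, parents)
    return annotated_set & implied


def prune(annotated: Iterable[str],
          parents: Dict[str, Set[str]]) -> Set[str]:
    """Keep only the most specific annotations for one gene product."""
    return set(annotated) - redundant_terms(annotated, parents)
```

For a gene product annotated with `T:a`, `T:b`, and `T:c`, `redundant_terms` flags `T:a` because it is an ancestor of `T:b`, and `prune` keeps `T:b` and `T:c`. A taxonomy-consistency check could follow the same ancestor-walk pattern over the NCBI taxonomy tree, but the record does not describe that procedure in enough detail to sketch it here.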