Mining the Gene Wiki for functional genomic knowledge
BACKGROUND: Ontology-based gene annotations are important tools for organizing and analyzing genome-scale biological data. Collecting these annotations is a valuable but costly endeavor. The Gene Wiki makes use of Wikipedia as a low-cost, mass-collaborative platform for assembling text-based gene an...
Main Authors: | Good, Benjamin M; Howe, Douglas G; Lin, Simon M; Kibbe, Warren A; Su, Andrew I |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3271090/ https://www.ncbi.nlm.nih.gov/pubmed/22165947 http://dx.doi.org/10.1186/1471-2164-12-603 |
_version_ | 1782222653916971008 |
---|---|
author | Good, Benjamin M Howe, Douglas G Lin, Simon M Kibbe, Warren A Su, Andrew I |
author_facet | Good, Benjamin M Howe, Douglas G Lin, Simon M Kibbe, Warren A Su, Andrew I |
author_sort | Good, Benjamin M |
collection | PubMed |
description | BACKGROUND: Ontology-based gene annotations are important tools for organizing and analyzing genome-scale biological data. Collecting these annotations is a valuable but costly endeavor. The Gene Wiki makes use of Wikipedia as a low-cost, mass-collaborative platform for assembling text-based gene annotations. The Gene Wiki is comprised of more than 10,000 review articles, each describing one human gene. The goal of this study is to define and assess a computational strategy for translating the text of Gene Wiki articles into ontology-based gene annotations. We specifically explore the generation of structured annotations using the Gene Ontology and the Human Disease Ontology. RESULTS: Our system produced 2,983 candidate gene annotations using the Disease Ontology and 11,022 candidate annotations using the Gene Ontology from the text of the Gene Wiki. Based on manual evaluations and comparisons to reference annotation sets, we estimate a precision of 90-93% for the Disease Ontology annotations and 48-64% for the Gene Ontology annotations. We further demonstrate that this data set can systematically improve the results from gene set enrichment analyses. CONCLUSIONS: The Gene Wiki is a rapidly growing corpus of text focused on human gene function. Here, we demonstrate that the Gene Wiki can be a powerful resource for generating ontology-based gene annotations. These annotations can be used immediately to improve workflows for building curated gene annotation databases and knowledge-based statistical analyses. |
format | Online Article Text |
id | pubmed-3271090 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3271090 2012-02-08 Mining the Gene Wiki for functional genomic knowledge Good, Benjamin M Howe, Douglas G Lin, Simon M Kibbe, Warren A Su, Andrew I BMC Genomics Research Article BACKGROUND: Ontology-based gene annotations are important tools for organizing and analyzing genome-scale biological data. Collecting these annotations is a valuable but costly endeavor. The Gene Wiki makes use of Wikipedia as a low-cost, mass-collaborative platform for assembling text-based gene annotations. The Gene Wiki is comprised of more than 10,000 review articles, each describing one human gene. The goal of this study is to define and assess a computational strategy for translating the text of Gene Wiki articles into ontology-based gene annotations. We specifically explore the generation of structured annotations using the Gene Ontology and the Human Disease Ontology. RESULTS: Our system produced 2,983 candidate gene annotations using the Disease Ontology and 11,022 candidate annotations using the Gene Ontology from the text of the Gene Wiki. Based on manual evaluations and comparisons to reference annotation sets, we estimate a precision of 90-93% for the Disease Ontology annotations and 48-64% for the Gene Ontology annotations. We further demonstrate that this data set can systematically improve the results from gene set enrichment analyses. CONCLUSIONS: The Gene Wiki is a rapidly growing corpus of text focused on human gene function. Here, we demonstrate that the Gene Wiki can be a powerful resource for generating ontology-based gene annotations. These annotations can be used immediately to improve workflows for building curated gene annotation databases and knowledge-based statistical analyses. BioMed Central 2011-12-13 /pmc/articles/PMC3271090/ /pubmed/22165947 http://dx.doi.org/10.1186/1471-2164-12-603 Text en Copyright ©2011 Good et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Good, Benjamin M Howe, Douglas G Lin, Simon M Kibbe, Warren A Su, Andrew I Mining the Gene Wiki for functional genomic knowledge |
title | Mining the Gene Wiki for functional genomic knowledge |
title_full | Mining the Gene Wiki for functional genomic knowledge |
title_fullStr | Mining the Gene Wiki for functional genomic knowledge |
title_full_unstemmed | Mining the Gene Wiki for functional genomic knowledge |
title_short | Mining the Gene Wiki for functional genomic knowledge |
title_sort | mining the gene wiki for functional genomic knowledge |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3271090/ https://www.ncbi.nlm.nih.gov/pubmed/22165947 http://dx.doi.org/10.1186/1471-2164-12-603 |
work_keys_str_mv | AT goodbenjaminm miningthegenewikiforfunctionalgenomicknowledge AT howedouglasg miningthegenewikiforfunctionalgenomicknowledge AT linsimonm miningthegenewikiforfunctionalgenomicknowledge AT kibbewarrena miningthegenewikiforfunctionalgenomicknowledge AT suandrewi miningthegenewikiforfunctionalgenomicknowledge |