GO2PUB: Querying PubMed with semantic expansion of gene ontology terms
BACKGROUND: With the development of high throughput methods of gene analyses, there is a growing need for mining tools to retrieve relevant articles in PubMed. As PubMed grows, literature searches become more complex and time-consuming. Automated search tools with good precision and recall are neces...
Main Authors: | Bettembourg, Charles; Diot, Christian; Burgun, Anita; Dameron, Olivier |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599846/ https://www.ncbi.nlm.nih.gov/pubmed/22958570 http://dx.doi.org/10.1186/2041-1480-3-7 |
_version_ | 1782475543508156416 |
---|---|
author | Bettembourg, Charles Diot, Christian Burgun, Anita Dameron, Olivier |
author_sort | Bettembourg, Charles |
collection | PubMed |
description | BACKGROUND: With the development of high-throughput methods of gene analyses, there is a growing need for mining tools to retrieve relevant articles in PubMed. As PubMed grows, literature searches become more complex and time-consuming. Automated search tools with good precision and recall are necessary. We developed GO2PUB to automatically enrich PubMed queries with gene names, symbols and synonyms annotated by a GO term of interest or one of its descendants. RESULTS: GO2PUB enriches PubMed queries based on selected GO terms and keywords. It processes the results and displays the PMID, title, authors, abstract and bibliographic references of the articles. Gene names, symbols and synonyms that have been generated as extra keywords from the GO terms are also highlighted. GO2PUB is based on a semantic expansion of PubMed queries using the semantic inheritance between terms through the GO graph. Two experts manually assessed the relevance of GO2PUB, GoPubMed and PubMed on three queries about lipid metabolism. The experts’ agreement was high (kappa = 0.88). GO2PUB returned 69% of the relevant articles, GoPubMed 40% and PubMed 29%. GO2PUB and GoPubMed have 17% of their results in common, corresponding to 24% of the total number of relevant results. 70% of the articles returned by more than one tool were relevant. 36% of the relevant articles were returned only by GO2PUB, 17% only by GoPubMed and 14% only by PubMed. To determine whether these results can be generalized, we generated twenty queries based on random GO terms with a granularity similar to that of the first three queries and compared the proportions of GO2PUB and GoPubMed results. These proportions were 77% and 40%, respectively, for the first queries, and 70% and 38% for the random queries. The two experts also assessed the relevance of seven of the twenty queries (the three related to lipid metabolism and four related to other domains). Expert agreement was high (0.93 and 0.8). The performance of GO2PUB and GoPubMed was similar to that observed on the first queries. CONCLUSIONS: We demonstrated that the use of genes annotated by either a GO term of interest or one of its descendants yields some relevant articles ignored by other tools. The comparison of GO2PUB, based on semantic expansion, with GoPubMed, based on text mining techniques, showed that the two tools are complementary. The analysis of the randomly generated queries suggests that the results obtained about lipid metabolism can be generalized to other biological processes. GO2PUB is available at http://go2pub.genouest.org. |
format | Online Article Text |
id | pubmed-3599846 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3599846 2013-03-17 GO2PUB: Querying PubMed with semantic expansion of gene ontology terms Bettembourg, Charles Diot, Christian Burgun, Anita Dameron, Olivier J Biomed Semantics Research BioMed Central 2012-09-07 /pmc/articles/PMC3599846/ /pubmed/22958570 http://dx.doi.org/10.1186/2041-1480-3-7 Text en Copyright ©2012 Bettembourg et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Bettembourg, Charles Diot, Christian Burgun, Anita Dameron, Olivier GO2PUB: Querying PubMed with semantic expansion of gene ontology terms |
title | GO2PUB: Querying PubMed with semantic expansion of gene ontology terms |
title_sort | go2pub: querying pubmed with semantic expansion of gene ontology terms |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599846/ https://www.ncbi.nlm.nih.gov/pubmed/22958570 http://dx.doi.org/10.1186/2041-1480-3-7 |
work_keys_str_mv | AT bettembourgcharles go2pubqueryingpubmedwithsemanticexpansionofgeneontologyterms AT diotchristian go2pubqueryingpubmedwithsemanticexpansionofgeneontologyterms AT burgunanita go2pubqueryingpubmedwithsemanticexpansionofgeneontologyterms AT dameronolivier go2pubqueryingpubmedwithsemanticexpansionofgeneontologyterms |
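
The description above says that GO2PUB enriches a PubMed query with the genes annotated by a GO term of interest or by one of its descendants. The following Python sketch illustrates that idea only; it is not the GO2PUB implementation. The toy graph, the term identifiers (`GO:demo_*`), the gene symbols (`GENE1` to `GENE3`), the function names and the exact query shape are all assumptions made for illustration. `[Title/Abstract]` is a standard PubMed search field tag, but whether GO2PUB restricts its search to that field is not stated in this record.

```python
from collections import deque

# Hypothetical toy GO graph: term -> direct children. Not real GO data.
GO_CHILDREN = {
    "GO:demo_root": ["GO:demo_a", "GO:demo_b"],
    "GO:demo_a": ["GO:demo_c"],
    "GO:demo_b": ["GO:demo_c"],  # GO is a DAG, so a term may have several parents
}

# Hypothetical gene annotations: term -> gene symbols annotated by that term.
ANNOTATIONS = {
    "GO:demo_root": ["GENE1"],
    "GO:demo_a": ["GENE2"],
    "GO:demo_c": ["GENE3", "GENE2"],
}


def descendants(term, children):
    """Return the term itself plus every term reachable through child edges."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # skip terms already reached via another parent
                seen.add(child)
                queue.append(child)
    return seen


def build_query(term, keywords, children, annotations):
    """Build a PubMed-style query: (gene1 OR gene2 ...) AND (keyword1 AND ...)."""
    terms = descendants(term, children)
    genes = sorted({g for t in terms for g in annotations.get(t, [])})
    gene_clause = " OR ".join(f'"{g}"[Title/Abstract]' for g in genes)
    keyword_clause = " AND ".join(f'"{k}"[Title/Abstract]' for k in keywords)
    clauses = [f"({c})" for c in (gene_clause, keyword_clause) if c]
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_query("GO:demo_root", ["liver"], GO_CHILDREN, ANNOTATIONS))
```

Collecting genes over the whole descendant set is what distinguishes this expansion from a plain keyword search: a gene annotated only by a more specific term (here `GENE3`) still contributes to a query on the more general term. A real tool would also need gene names and synonyms in addition to symbols, as the description mentions.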