
PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites

A particular challenge in biomedical text mining is to find ways of handling ‘comprehensive’ or ‘associative’ queries such as ‘Find all genes associated with breast cancer’. Given that many queries in genomics, proteomics or metabolomics involve these kinds of comprehensive searches, we believe that a...


Bibliographic Details
Main authors: Cheng, Dean; Knox, Craig; Young, Nelson; Stothard, Paul; Damaraju, Sambasivarao; Wishart, David S.
Format: Text
Language: English
Published: Oxford University Press 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2447794/
https://www.ncbi.nlm.nih.gov/pubmed/18487273
http://dx.doi.org/10.1093/nar/gkn296
_version_ 1782157001138110464
author Cheng, Dean
Knox, Craig
Young, Nelson
Stothard, Paul
Damaraju, Sambasivarao
Wishart, David S.
author_facet Cheng, Dean
Knox, Craig
Young, Nelson
Stothard, Paul
Damaraju, Sambasivarao
Wishart, David S.
author_sort Cheng, Dean
collection PubMed
description A particular challenge in biomedical text mining is to find ways of handling ‘comprehensive’ or ‘associative’ queries such as ‘Find all genes associated with breast cancer’. Given that many queries in genomics, proteomics or metabolomics involve these kinds of comprehensive searches, we believe that a web-based tool that could support these searches would be quite useful. In response to this need, we have developed the PolySearch web server. PolySearch supports >50 different classes of queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases. The typical query supported by PolySearch is ‘Given X, find all Y's’ where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs and metabolites. PolySearch also exploits a variety of techniques in text mining and information retrieval to identify, highlight and rank informative abstracts, paragraphs or sentences. PolySearch's performance has been assessed in tasks such as gene synonym identification, protein–protein interaction identification and disease gene identification using a variety of manually assembled ‘gold standard’ text corpuses. Its f-measure on these tasks is 88, 81 and 79%, respectively. These values are between 5 and 50% better than other published tools. The server is freely available at http://wishart.biology.ualberta.ca/polysearch
format Text
id pubmed-2447794
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-24477942008-07-09 PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites Cheng, Dean Knox, Craig Young, Nelson Stothard, Paul Damaraju, Sambasivarao Wishart, David S. Nucleic Acids Res Articles A particular challenge in biomedical text mining is to find ways of handling ‘comprehensive’ or ‘associative’ queries such as ‘Find all genes associated with breast cancer’. Given that many queries in genomics, proteomics or metabolomics involve these kinds of comprehensive searches, we believe that a web-based tool that could support these searches would be quite useful. In response to this need, we have developed the PolySearch web server. PolySearch supports >50 different classes of queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases. The typical query supported by PolySearch is ‘Given X, find all Y's’ where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs and metabolites. PolySearch also exploits a variety of techniques in text mining and information retrieval to identify, highlight and rank informative abstracts, paragraphs or sentences. PolySearch's performance has been assessed in tasks such as gene synonym identification, protein–protein interaction identification and disease gene identification using a variety of manually assembled ‘gold standard’ text corpuses. Its f-measure on these tasks is 88, 81 and 79%, respectively. These values are between 5 and 50% better than other published tools.
The server is freely available at http://wishart.biology.ualberta.ca/polysearch Oxford University Press 2008-07-01 2008-05-16 /pmc/articles/PMC2447794/ /pubmed/18487273 http://dx.doi.org/10.1093/nar/gkn296 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Articles
Cheng, Dean
Knox, Craig
Young, Nelson
Stothard, Paul
Damaraju, Sambasivarao
Wishart, David S.
PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
title PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
title_full PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
title_fullStr PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
title_full_unstemmed PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
title_short PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
title_sort polysearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2447794/
https://www.ncbi.nlm.nih.gov/pubmed/18487273
http://dx.doi.org/10.1093/nar/gkn296
work_keys_str_mv AT chengdean polysearchawebbasedtextminingsystemforextractingrelationshipsbetweenhumandiseasesgenesmutationsdrugsandmetabolites
AT knoxcraig polysearchawebbasedtextminingsystemforextractingrelationshipsbetweenhumandiseasesgenesmutationsdrugsandmetabolites
AT youngnelson polysearchawebbasedtextminingsystemforextractingrelationshipsbetweenhumandiseasesgenesmutationsdrugsandmetabolites
AT stothardpaul polysearchawebbasedtextminingsystemforextractingrelationshipsbetweenhumandiseasesgenesmutationsdrugsandmetabolites
AT damarajusambasivarao polysearchawebbasedtextminingsystemforextractingrelationshipsbetweenhumandiseasesgenesmutationsdrugsandmetabolites
AT wishartdavids polysearchawebbasedtextminingsystemforextractingrelationshipsbetweenhumandiseasesgenesmutationsdrugsandmetabolites