
A literature search tool for intelligent extraction of disease-associated genes

OBJECTIVE: To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. METHODS: We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-minin...


Bibliographic Details
Main authors: Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2014
Subjects: Research and Applications
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3994846/
https://www.ncbi.nlm.nih.gov/pubmed/23999671
http://dx.doi.org/10.1136/amiajnl-2012-001563
author Jung, Jae-Yoon
DeLuca, Todd F
Nelson, Tristan H
Wall, Dennis P
collection PubMed
description OBJECTIVE: To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. METHODS: We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. RESULTS: We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder–gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. CONCLUSIONS: We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene–disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.
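The METHODS sentence above describes rule-based text mining with keyword matching. The following is only a minimal sketch of that general idea, not the authors' algorithm: the keyword lists, cue phrases, and the function name `extract_findings` are all hypothetical, chosen for illustration, and the record does not publish the actual rules or the PubMed query.

```python
import re

# All vocabularies below are hypothetical placeholders, not the tool's rules.
DISORDER_TERMS = {"autism", "schizophrenia"}
GENE_SYMBOLS = {"SHANK3", "NRXN1", "CNTNAP2"}
SIGNIFICANCE_CUES = ("significant", "associated with", "p <")
NEGATION_CUES = ("no significant", "not associated", "no association")
STUDY_TYPE_CUES = {
    "gwas": ("genome-wide association",),
    "linkage": ("linkage",),
    "expression": ("expression",),
}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_findings(abstract):
    """Return disorders, genes with positive findings, and study types."""
    lowered = abstract.lower()

    # Disorders: whole-word keyword match anywhere in the abstract.
    disorders = {
        term for term in DISORDER_TERMS
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    }

    # Study type: any cue phrase present in the abstract.
    study_types = {
        name for name, cues in STUDY_TYPE_CUES.items()
        if any(cue in lowered for cue in cues)
    }

    # Genes: keep a symbol only if its sentence has a significance cue
    # and no negation cue.
    genes = set()
    for sentence in split_sentences(abstract):
        s = sentence.lower()
        if any(cue in s for cue in NEGATION_CUES):
            continue
        if not any(cue in s for cue in SIGNIFICANCE_CUES):
            continue
        for token in re.findall(r"\b[A-Z][A-Z0-9]{1,9}\b", sentence):
            if token in GENE_SYMBOLS:
                genes.add(token)

    return {
        "disorders": sorted(disorders),
        "genes": sorted(genes),
        "study_types": sorted(study_types),
    }


if __name__ == "__main__":
    sample = (
        "We performed a genome-wide association study of autism. "
        "SHANK3 was significantly associated with autism. "
        "NRXN1 showed no significant association."
    )
    print(extract_findings(sample))
```

Filtering at the sentence level is one simple way to restrict output to "genes with significant results"; a real system would likely need a curated gene nomenclature and more robust negation handling.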
format Online
Article
Text
id pubmed-3994846
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-3994846 2014-04-22 J Am Med Inform Assoc Research and Applications BMJ Publishing Group 2014-05 2013-09-02 Text en Published by the BMJ Publishing Group Limited.
For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
title A literature search tool for intelligent extraction of disease-associated genes
topic Research and Applications