iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature

Numerous efforts have been made for developing text-mining tools to extract information from biomedical text automatically. They have assisted in many biological tasks, such as database curation and hypothesis generation. Text-mining tools are usually different from each other in terms of programmin...

Bibliographic Details
Main Authors: Ren, Jia, Li, Gang, Ross, Karen, Arighi, Cecilia, McGarvey, Peter, Rao, Shruti, Cowart, Julie, Madhavan, Subha, Vijay-Shanker, K, Wu, Cathy H
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6301332/
https://www.ncbi.nlm.nih.gov/pubmed/30576489
http://dx.doi.org/10.1093/database/bay128
_version_ 1783381818255343616
author Ren, Jia
Li, Gang
Ross, Karen
Arighi, Cecilia
McGarvey, Peter
Rao, Shruti
Cowart, Julie
Madhavan, Subha
Vijay-Shanker, K
Wu, Cathy H
author_sort Ren, Jia
collection PubMed
description Numerous efforts have been made for developing text-mining tools to extract information from biomedical text automatically. They have assisted in many biological tasks, such as database curation and hypothesis generation. Text-mining tools are usually different from each other in terms of programming language, system dependency and input/output format. There are few previous works that concern the integration of different text-mining tools and their results from large-scale text processing. In this paper, we describe the iTextMine system with an automated workflow to run multiple text-mining tools on large-scale text for knowledge extraction. We employ parallel processing with dockerized text-mining tools with a standardized JSON output format and implement a text alignment algorithm to solve the text discrepancy for result integration. iTextMine presently integrates four relation extraction tools, which have been used to process all the Medline abstracts and PMC open access full-length articles. The website allows users to browse the text evidence and view integrated results for knowledge discovery through a network view. We demonstrate the utilities of iTextMine with two use cases involving the gene PTEN and breast cancer and the gene SATB1.
format Online
Article
Text
id pubmed-6301332
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6301332 2018-12-27 iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature Ren, Jia Li, Gang Ross, Karen Arighi, Cecilia McGarvey, Peter Rao, Shruti Cowart, Julie Madhavan, Subha Vijay-Shanker, K Wu, Cathy H Database (Oxford) Original Article Numerous efforts have been made for developing text-mining tools to extract information from biomedical text automatically. They have assisted in many biological tasks, such as database curation and hypothesis generation. Text-mining tools are usually different from each other in terms of programming language, system dependency and input/output format. There are few previous works that concern the integration of different text-mining tools and their results from large-scale text processing. In this paper, we describe the iTextMine system with an automated workflow to run multiple text-mining tools on large-scale text for knowledge extraction. We employ parallel processing with dockerized text-mining tools with a standardized JSON output format and implement a text alignment algorithm to solve the text discrepancy for result integration. iTextMine presently integrates four relation extraction tools, which have been used to process all the Medline abstracts and PMC open access full-length articles. The website allows users to browse the text evidence and view integrated results for knowledge discovery through a network view. We demonstrate the utilities of iTextMine with two use cases involving the gene PTEN and breast cancer and the gene SATB1. Oxford University Press 2018-12-14 /pmc/articles/PMC6301332/ /pubmed/30576489 http://dx.doi.org/10.1093/database/bay128 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6301332/
https://www.ncbi.nlm.nih.gov/pubmed/30576489
http://dx.doi.org/10.1093/database/bay128