An integrated text mining framework for metabolic interaction network reconstruction

Text mining (TM) in the field of biology is fast becoming a routine analysis for the extraction and curation of biological entities (e.g., genes, proteins, simple chemicals) as well as their relationships. Due to the wide applicability of TM in situations involving complex relationships, it is valua...

Bibliographic Details
Main authors: Patumcharoenpol, Preecha; Doungpan, Narumol; Meechai, Asawin; Shen, Bairong; Chan, Jonathan H.; Vongsangnak, Wanwipa
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2016
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4806637/
https://www.ncbi.nlm.nih.gov/pubmed/27019783
http://dx.doi.org/10.7717/peerj.1811
author Patumcharoenpol, Preecha
Doungpan, Narumol
Meechai, Asawin
Shen, Bairong
Chan, Jonathan H.
Vongsangnak, Wanwipa
author_facet Patumcharoenpol, Preecha
Doungpan, Narumol
Meechai, Asawin
Shen, Bairong
Chan, Jonathan H.
Vongsangnak, Wanwipa
author_sort Patumcharoenpol, Preecha
collection PubMed
description Text mining (TM) in the field of biology is fast becoming a routine analysis for the extraction and curation of biological entities (e.g., genes, proteins, simple chemicals) as well as their relationships. Because TM is widely applicable in situations involving complex relationships, it is valuable to apply it to the extraction of metabolic interactions (i.e., enzyme and metabolite interactions) through metabolic events. Here we present an integrated TM framework containing two modules: one for the extraction of metabolic events (the Metabolic Event Extraction module, MEE) and one for the construction of a metabolic interaction network (the Metabolic Interaction Network Reconstruction module, MINR). The proposed integrated TM framework performed well based on the standard measures of recall, precision and F-score. Evaluation of the MEE module using the constructed Metabolic Entities (ME) corpus yielded F-scores of 59.15% and 48.59% for the detection of metabolic events for production and consumption, respectively. When the entity tagger for Gene and Protein (GP) and metabolite was tested with the test corpus, the obtained F-score was greater than 80% for the Superpathway of leucine, valine, and isoleucine biosynthesis. Mapping of enzyme and metabolite interactions through network reconstruction showed fair performance for the MINR module on the test corpus, with an F-score >70%. Finally, an application of our integrated TM framework to large-scale data (i.e., EcoCyc extraction data) for reconstructing a metabolic interaction network showed reasonable precision of 69.93%, 70.63% and 46.71% for enzyme, metabolite and enzyme–metabolite interaction, respectively. This study presents the first open-source integrated TM framework for reconstructing a metabolic interaction network. This framework can be a powerful tool that helps biologists to extract metabolic events for further reconstruction of a metabolic interaction network.
The ME corpus, test corpus, source code, and virtual machine image with pre-configured software are available at www.sbi.kmutt.ac.th/ preecha/metrecon.
format Online
Article
Text
id pubmed-4806637
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-4806637 2016-03-25 An integrated text mining framework for metabolic interaction network reconstruction Patumcharoenpol, Preecha Doungpan, Narumol Meechai, Asawin Shen, Bairong Chan, Jonathan H. Vongsangnak, Wanwipa PeerJ Bioinformatics PeerJ Inc. 2016-03-21 /pmc/articles/PMC4806637/ /pubmed/27019783 http://dx.doi.org/10.7717/peerj.1811 Text en ©2016 Patumcharoenpol et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Patumcharoenpol, Preecha
Doungpan, Narumol
Meechai, Asawin
Shen, Bairong
Chan, Jonathan H.
Vongsangnak, Wanwipa
An integrated text mining framework for metabolic interaction network reconstruction
title An integrated text mining framework for metabolic interaction network reconstruction
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4806637/
https://www.ncbi.nlm.nih.gov/pubmed/27019783
http://dx.doi.org/10.7717/peerj.1811