An integrated text mining framework for metabolic interaction network reconstruction
Text mining (TM) in the field of biology is fast becoming a routine analysis for the extraction and curation of biological entities (e.g., genes, proteins, simple chemicals) as well as their relationships. Due to the wide applicability of TM in situations involving complex relationships, it is valua...
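Main Authors: | Patumcharoenpol, Preecha; Doungpan, Narumol; Meechai, Asawin; Shen, Bairong; Chan, Jonathan H.; Vongsangnak, Wanwipa |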
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2016 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4806637/ https://www.ncbi.nlm.nih.gov/pubmed/27019783 http://dx.doi.org/10.7717/peerj.1811 |
_version_ | 1782423268911742976 |
---|---|
author | Patumcharoenpol, Preecha Doungpan, Narumol Meechai, Asawin Shen, Bairong Chan, Jonathan H. Vongsangnak, Wanwipa |
author_sort | Patumcharoenpol, Preecha |
collection | PubMed |
description | Text mining (TM) in the field of biology is fast becoming a routine analysis for the extraction and curation of biological entities (e.g., genes, proteins, simple chemicals) as well as their relationships. Due to the wide applicability of TM in situations involving complex relationships, it is valuable to apply TM to the extraction of metabolic interactions (i.e., enzyme and metabolite interactions) through metabolic events. Here we present an integrated TM framework containing two modules: one for the extraction of metabolic events (Metabolic Event Extraction module, MEE) and one for the construction of a metabolic interaction network (Metabolic Interaction Network Reconstruction module, MINR). The proposed integrated TM framework performed well based on standard measures of recall, precision and F-score. Evaluation of the MEE module using the constructed Metabolic Entities (ME) corpus yielded F-scores of 59.15% and 48.59% for the detection of metabolic events for production and consumption, respectively. In testing the entity tagger for Gene and Protein (GP) and metabolite on the test corpus, the F-score was greater than 80% for the Superpathway of leucine, valine, and isoleucine biosynthesis. Mapping of enzyme and metabolite interactions through network reconstruction showed fair performance for the MINR module on the test corpus, with an F-score >70%. Finally, applying the integrated TM framework to large-scale data (i.e., EcoCyc extraction data) to reconstruct a metabolic interaction network yielded precisions of 69.93%, 70.63% and 46.71% for enzyme, metabolite and enzyme–metabolite interaction, respectively. This study presents the first open-source integrated TM framework for reconstructing a metabolic interaction network. This framework can be a powerful tool that helps biologists extract metabolic events for further reconstruction of a metabolic interaction network. The ME corpus, test corpus, source code, and virtual machine image with pre-configured software are available at www.sbi.kmutt.ac.th/~preecha/metrecon. |
format | Online Article Text |
id | pubmed-4806637 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4806637 2016-03-25 An integrated text mining framework for metabolic interaction network reconstruction Patumcharoenpol, Preecha Doungpan, Narumol Meechai, Asawin Shen, Bairong Chan, Jonathan H. Vongsangnak, Wanwipa PeerJ Bioinformatics PeerJ Inc. 2016-03-21 /pmc/articles/PMC4806637/ /pubmed/27019783 http://dx.doi.org/10.7717/peerj.1811 Text en ©2016 Patumcharoenpol et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | An integrated text mining framework for metabolic interaction network reconstruction |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4806637/ https://www.ncbi.nlm.nih.gov/pubmed/27019783 http://dx.doi.org/10.7717/peerj.1811 |
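
The abstract reports precision, recall and F-score for extracted enzyme–metabolite interactions. As a rough illustration of how such figures are typically computed, the following minimal sketch compares a set of predicted interaction pairs against a reference set. It is not the authors' evaluation code: the function name `evaluate_interactions`, the pair representation, and the example identifiers are all hypothetical, and the F-score shown is the balanced F1 (harmonic mean of precision and recall), which is assumed here rather than confirmed by the record.

```python
from typing import Dict, Iterable, Tuple

# An interaction is modelled as an (enzyme, metabolite) pair.
# This representation is an assumption made for illustration only.
Interaction = Tuple[str, str]


def evaluate_interactions(
    predicted: Iterable[Interaction],
    reference: Iterable[Interaction],
) -> Dict[str, float]:
    """Return precision, recall and balanced F-score for predicted pairs."""
    pred, ref = set(predicted), set(reference)
    true_positives = len(pred & ref)  # pairs found in both sets

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0

    # Balanced F-score: harmonic mean of precision and recall.
    if precision + recall == 0:
        f_score = 0.0
    else:
        f_score = 2 * precision * recall / (precision + recall)

    return {"precision": precision, "recall": recall, "f_score": f_score}


# Hypothetical placeholder data, not taken from the ME corpus or EcoCyc.
predicted_pairs = {
    ("enzyme_A", "metabolite_1"),
    ("enzyme_A", "metabolite_2"),
    ("enzyme_B", "metabolite_3"),
    ("enzyme_C", "metabolite_4"),
}
reference_pairs = {
    ("enzyme_A", "metabolite_1"),
    ("enzyme_A", "metabolite_2"),
    ("enzyme_B", "metabolite_3"),
    ("enzyme_B", "metabolite_5"),
    ("enzyme_D", "metabolite_6"),
}

scores = evaluate_interactions(predicted_pairs, reference_pairs)
print(scores)
```

Treating interactions as a set of pairs means duplicates are counted once and order is ignored; a real evaluation would also need a rule for matching synonymous entity names, which this sketch leaves out.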