Construction of biological networks from unstructured information based on a semi-automated curation workflow
Capture and representation of scientific knowledge in a structured format are essential to improve the understanding of biological mechanisms involved in complex diseases. Biological knowledge and knowledge about standardized terminologies are difficult to capture from literature in a usable form. A...
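Main authors: | Szostak, Justyna; Ansari, Sam; Madan, Sumit; Fluck, Juliane; Talikka, Marja; Iskandar, Anita; De Leon, Hector; Hofmann-Apitius, Martin; Peitsch, Manuel C.; Hoeng, Julia |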
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630939/ https://www.ncbi.nlm.nih.gov/pubmed/26200752 http://dx.doi.org/10.1093/database/bav057 |
_version_ | 1783269329811275776 |
---|---|
author | Szostak, Justyna Ansari, Sam Madan, Sumit Fluck, Juliane Talikka, Marja Iskandar, Anita De Leon, Hector Hofmann-Apitius, Martin Peitsch, Manuel C. Hoeng, Julia |
author_sort | Szostak, Justyna |
collection | PubMed |
description | Capture and representation of scientific knowledge in a structured format are essential to improve the understanding of biological mechanisms involved in complex diseases. Biological knowledge and knowledge about standardized terminologies are difficult to capture from literature in a usable form. A semi-automated knowledge extraction workflow is presented that was developed to allow users to extract causal and correlative relationships from scientific literature and to transcribe them into the computable and human readable Biological Expression Language (BEL). The workflow combines state-of-the-art linguistic tools for recognition of various entities and extraction of knowledge from literature sources. Unlike most other approaches, the workflow outputs the results to a curation interface for manual curation and converts them into BEL documents that can be compiled to form biological networks. We developed a new semi-automated knowledge extraction workflow that was designed to capture and organize scientific knowledge and reduce the required curation skills and effort for this task. The workflow was used to build a network that represents the cellular and molecular mechanisms implicated in atherosclerotic plaque destabilization in an apolipoprotein-E-deficient (ApoE−/−) mouse model. The network was generated using knowledge extracted from the primary literature. The resultant atherosclerotic plaque destabilization network contains 304 nodes and 743 edges supported by 33 PubMed referenced articles. A comparison between the semi-automated and conventional curation processes showed similar results, but significantly reduced curation effort for the semi-automated process. Creating structured knowledge from unstructured text is an important step for the mechanistic interpretation and reusability of knowledge. Our new semi-automated knowledge extraction workflow reduced the curation skills and effort required to capture and organize scientific knowledge. The atherosclerotic plaque destabilization network that was generated is a causal network model for vascular disease demonstrating the usefulness of the workflow for knowledge extraction and construction of mechanistically meaningful biological networks. |
format | Online Article Text |
id | pubmed-5630939 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5630939 2017-11-16 Construction of biological networks from unstructured information based on a semi-automated curation workflow Szostak, Justyna Ansari, Sam Madan, Sumit Fluck, Juliane Talikka, Marja Iskandar, Anita De Leon, Hector Hofmann-Apitius, Martin Peitsch, Manuel C. Hoeng, Julia Database (Oxford) Original Article Capture and representation of scientific knowledge in a structured format are essential to improve the understanding of biological mechanisms involved in complex diseases. Biological knowledge and knowledge about standardized terminologies are difficult to capture from literature in a usable form. A semi-automated knowledge extraction workflow is presented that was developed to allow users to extract causal and correlative relationships from scientific literature and to transcribe them into the computable and human readable Biological Expression Language (BEL). The workflow combines state-of-the-art linguistic tools for recognition of various entities and extraction of knowledge from literature sources. Unlike most other approaches, the workflow outputs the results to a curation interface for manual curation and converts them into BEL documents that can be compiled to form biological networks. We developed a new semi-automated knowledge extraction workflow that was designed to capture and organize scientific knowledge and reduce the required curation skills and effort for this task. The workflow was used to build a network that represents the cellular and molecular mechanisms implicated in atherosclerotic plaque destabilization in an apolipoprotein-E-deficient (ApoE−/−) mouse model. The network was generated using knowledge extracted from the primary literature. The resultant atherosclerotic plaque destabilization network contains 304 nodes and 743 edges supported by 33 PubMed referenced articles. A comparison between the semi-automated and conventional curation processes showed similar results, but significantly reduced curation effort for the semi-automated process. Creating structured knowledge from unstructured text is an important step for the mechanistic interpretation and reusability of knowledge. Our new semi-automated knowledge extraction workflow reduced the curation skills and effort required to capture and organize scientific knowledge. The atherosclerotic plaque destabilization network that was generated is a causal network model for vascular disease demonstrating the usefulness of the workflow for knowledge extraction and construction of mechanistically meaningful biological networks. Oxford University Press 2015-06-16 /pmc/articles/PMC5630939/ /pubmed/26200752 http://dx.doi.org/10.1093/database/bav057 Text en © The Author(s) 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Construction of biological networks from unstructured information based on a semi-automated curation workflow |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630939/ https://www.ncbi.nlm.nih.gov/pubmed/26200752 http://dx.doi.org/10.1093/database/bav057 |