Construction of biological networks from unstructured information based on a semi-automated curation workflow

Bibliographic Details
Main authors: Szostak, Justyna; Ansari, Sam; Madan, Sumit; Fluck, Juliane; Talikka, Marja; Iskandar, Anita; De Leon, Hector; Hofmann-Apitius, Martin; Peitsch, Manuel C.; Hoeng, Julia
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630939/
https://www.ncbi.nlm.nih.gov/pubmed/26200752
http://dx.doi.org/10.1093/database/bav057
author Szostak, Justyna
Ansari, Sam
Madan, Sumit
Fluck, Juliane
Talikka, Marja
Iskandar, Anita
De Leon, Hector
Hofmann-Apitius, Martin
Peitsch, Manuel C.
Hoeng, Julia
author_sort Szostak, Justyna
collection PubMed
description Capture and representation of scientific knowledge in a structured format are essential to improve the understanding of biological mechanisms involved in complex diseases. Biological knowledge and knowledge about standardized terminologies are difficult to capture from literature in a usable form. A semi-automated knowledge extraction workflow is presented that was developed to allow users to extract causal and correlative relationships from scientific literature and to transcribe them into the computable and human-readable Biological Expression Language (BEL). The workflow combines state-of-the-art linguistic tools for recognition of various entities and extraction of knowledge from literature sources. Unlike most other approaches, the workflow outputs the results to a curation interface for manual curation and converts them into BEL documents that can be compiled to form biological networks. We developed a new semi-automated knowledge extraction workflow that was designed to capture and organize scientific knowledge and reduce the required curation skills and effort for this task. The workflow was used to build a network that represents the cellular and molecular mechanisms implicated in atherosclerotic plaque destabilization in an apolipoprotein-E-deficient (ApoE−/−) mouse model. The network was generated using knowledge extracted from the primary literature. The resultant atherosclerotic plaque destabilization network contains 304 nodes and 743 edges supported by 33 PubMed-referenced articles. A comparison between the semi-automated and conventional curation processes showed similar results, but significantly reduced curation effort for the semi-automated process. Creating structured knowledge from unstructured text is an important step for the mechanistic interpretation and reusability of knowledge. Our new semi-automated knowledge extraction workflow reduced the curation skills and effort required to capture and organize scientific knowledge. The atherosclerotic plaque destabilization network that was generated is a causal network model for vascular disease, demonstrating the usefulness of the workflow for knowledge extraction and construction of mechanistically meaningful biological networks.
format Online
Article
Text
id pubmed-5630939
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5630939 2017-11-16 Construction of biological networks from unstructured information based on a semi-automated curation workflow Szostak, Justyna; Ansari, Sam; Madan, Sumit; Fluck, Juliane; Talikka, Marja; Iskandar, Anita; De Leon, Hector; Hofmann-Apitius, Martin; Peitsch, Manuel C.; Hoeng, Julia Database (Oxford) Original Article Oxford University Press 2015-06-16 /pmc/articles/PMC5630939/ /pubmed/26200752 http://dx.doi.org/10.1093/database/bav057 Text en © The Author(s) 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Construction of biological networks from unstructured information based on a semi-automated curation workflow
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630939/
https://www.ncbi.nlm.nih.gov/pubmed/26200752
http://dx.doi.org/10.1093/database/bav057