Automated extraction and semantic analysis of mutation impacts from the biomedical literature

Bibliographic Details
Main authors: Naderi, Nona; Witte, René
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects: Proceedings
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395893/
https://www.ncbi.nlm.nih.gov/pubmed/22759648
http://dx.doi.org/10.1186/1471-2164-13-S4-S10
author Naderi, Nona
Witte, René
collection PubMed
description BACKGROUND: Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Access to mutation information and its impacts on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast-growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges still exist in grounding impacts to their respective mutations and recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities.
RESULTS: We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguate genes and proteins. Our system then detects mutation series to correctly ground detected impacts using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects, and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1%, a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data, and end-user and developer documentation, is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner.
CONCLUSION: We present Open Mutation Miner (OMM), the first comprehensive, fully open-source approach to automatically extract impacts and related relevant information from the biomedical literature. We assessed the performance of our work on manually annotated corpora and the results show the reliability of our approach. The representation of the extracted information in a structured format facilitates knowledge management and aids in database curation and correction. Furthermore, access to the analysis results is provided through multiple interfaces, including web services for automated data integration and desktop-based solutions for end-user interactions.
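The abstract notes that detection of single-point mutations is well covered by existing systems. As a rough illustration of what such detection can look like, the sketch below matches mentions written in the common "wild-type residue, position, mutant residue" shorthand (for example H57A). This is not the method used by Open Mutation Miner, whose implementation is not described in this record; the function name `find_point_mutations`, the pattern, and the sample sentence are hypothetical and chosen only for illustration.

```python
import re

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed notation: wild-type residue, 1-based position, mutant residue (e.g. "H57A").
POINT_MUTATION = re.compile(
    rf"\b([{AMINO_ACIDS}])([1-9]\d*)([{AMINO_ACIDS}])\b"
)


def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples in order of appearance."""
    mutations = []
    for wild_type, position, mutant in POINT_MUTATION.findall(text):
        # Skip tokens such as "S195S", where no residue actually changes.
        if wild_type != mutant:
            mutations.append((wild_type, int(position), mutant))
    return mutations


if __name__ == "__main__":
    # Hypothetical sentence, not taken from the article.
    sentence = "The H57A and D102N variants showed reduced activity."
    print(find_point_mutations(sentence))
```

A pattern this simple will also match unrelated tokens that happen to share the shape (certain strain or plasmid names, for instance), and it does nothing to link a mutation to its impact, which is the harder grounding problem the abstract describes.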
format Online
Article
Text
id pubmed-3395893
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3395893 2012-07-16 Automated extraction and semantic analysis of mutation impacts from the biomedical literature Naderi, Nona Witte, René BMC Genomics Proceedings BioMed Central 2012-06-18 /pmc/articles/PMC3395893/ /pubmed/22759648 http://dx.doi.org/10.1186/1471-2164-13-S4-S10 Text en Copyright ©2012 Naderi and Witte; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated extraction and semantic analysis of mutation impacts from the biomedical literature
topic Proceedings