Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

BACKGROUND: Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as...

Bibliographic Details
Main Authors: Ratkovic, Zorana, Golik, Wiktoria, Warnier, Pierre
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384252/
https://www.ncbi.nlm.nih.gov/pubmed/22759462
http://dx.doi.org/10.1186/1471-2105-13-S11-S8
_version_ 1782236683583881216
author Ratkovic, Zorana
Golik, Wiktoria
Warnier, Pierre
author_facet Ratkovic, Zorana
Golik, Wiktoria
Warnier, Pierre
author_sort Ratkovic, Zorana
collection PubMed
description BACKGROUND: Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. METHODS: We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. RESULTS: We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. CONCLUSIONS: We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.
format Online
Article
Text
id pubmed-3384252
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-33842522012-06-28 Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach Ratkovic, Zorana Golik, Wiktoria Warnier, Pierre BMC Bioinformatics Proceedings BACKGROUND: Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. METHODS: We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. RESULTS: We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. CONCLUSIONS: We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. 
We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data. BioMed Central 2012-06-26 /pmc/articles/PMC3384252/ /pubmed/22759462 http://dx.doi.org/10.1186/1471-2105-13-S11-S8 Text en Copyright ©2012 Ratkovic et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Ratkovic, Zorana
Golik, Wiktoria
Warnier, Pierre
Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
title Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
title_full Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
title_fullStr Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
title_full_unstemmed Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
title_short Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
title_sort event extraction of bacteria biotopes: a knowledge-intensive nlp-based approach
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384252/
https://www.ncbi.nlm.nih.gov/pubmed/22759462
http://dx.doi.org/10.1186/1471-2105-13-S11-S8
work_keys_str_mv AT ratkoviczorana eventextractionofbacteriabiotopesaknowledgeintensivenlpbasedapproach
AT golikwiktoria eventextractionofbacteriabiotopesaknowledgeintensivenlpbasedapproach
AT warnierpierre eventextractionofbacteriabiotopesaknowledgeintensivenlpbasedapproach