Automatic extraction of biomolecular interactions: an empirical approach

BACKGROUND: We describe a method for extracting data about how biomolecule pairs interact from texts. This method relies on empirically determined characteristics of sentences. The characteristics are efficient to compute, making this approach to extraction of biomolecular interactions scalable. The...

Bibliographic Details
Main Authors: Zhang, Lifeng, Berleant, Daniel, Ding, Jing, Wurtele, Eve Syrkin
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3729816/
https://www.ncbi.nlm.nih.gov/pubmed/23883165
http://dx.doi.org/10.1186/1471-2105-14-234
_version_ 1782278995647135744
author Zhang, Lifeng
Berleant, Daniel
Ding, Jing
Wurtele, Eve Syrkin
author_facet Zhang, Lifeng
Berleant, Daniel
Ding, Jing
Wurtele, Eve Syrkin
author_sort Zhang, Lifeng
collection PubMed
description BACKGROUND: We describe a method for extracting data about how biomolecule pairs interact from texts. This method relies on empirically determined characteristics of sentences. The characteristics are efficient to compute, making this approach to extraction of biomolecular interactions scalable. The results of such interaction mining can support interaction network annotation, question answering, database construction, and other applications. RESULTS: We constructed a software system to search MEDLINE for sentences likely to describe interactions between given biomolecules. The system extracts a list of the interaction-indicating terms appearing in those sentences, then ranks those terms based on their likelihood of correctly characterizing how the biomolecules interact. The ranking process uses a tf-idf (term frequency–inverse document frequency) based technique using empirically derived knowledge about sentences, and was applied to the MEDLINE literature collection. Software was developed as part of the MetNet toolkit (http://www.metnetdb.org). CONCLUSIONS: Specific, efficiently computable characteristics of sentences about biomolecular interactions were analyzed to better understand how to use these characteristics to extract how biomolecules interact. The text empirics method that was investigated, though arising from a classical tradition, has yet to be fully explored for the task of extracting biomolecular interactions from the literature. The conclusions we reach about the sentence characteristics investigated in this work, as well as the technique itself, could be used by other systems to provide evidence about putative interactions, thus supporting efforts to maximize the ability of hybrid systems to support such tasks as annotating and constructing interaction networks.
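The ranking step described in the abstract (scoring interaction-indicating terms with a tf-idf based technique) can be illustrated with a minimal sketch. This is not the authors' implementation: the term list, the function names, the tokenizer, and the smoothed idf formula are all assumptions chosen for illustration, and the record does not say how the MetNet software computes its scores.

```python
import math
from collections import Counter

# Hypothetical list of interaction-indicating terms (illustrative only;
# the actual term list used by the authors is not given in this record).
INTERACTION_TERMS = {"binds", "activates", "inhibits", "phosphorylates"}


def tokenize(sentence):
    """Lowercase a sentence and strip simple punctuation from each word."""
    return [w.strip(".,;:()").lower() for w in sentence.split()]


def rank_terms(pair_sentences, background_sentences):
    """Rank interaction terms found in pair_sentences by a tf-idf style score.

    tf  = number of pair sentences containing the term
    idf = log((1 + N) / (1 + df)) + 1, a smoothed variant (an assumption),
          where N is the number of background sentences and df is the
          number of background sentences containing the term.
    """
    n = len(background_sentences)

    # Document frequency of each interaction term in the background corpus.
    df = Counter()
    for s in background_sentences:
        df.update(set(tokenize(s)) & INTERACTION_TERMS)

    # Term frequency over the sentences mentioning the biomolecule pair.
    tf = Counter()
    for s in pair_sentences:
        tf.update(set(tokenize(s)) & INTERACTION_TERMS)

    scores = {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in tf}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy sentences; "A" and "B" stand in for a biomolecule pair.
    pair = ["A binds B.", "A binds B strongly.", "A activates B."]
    background = ["X activates Y.", "P activates Q.", "M binds N.", "Z inhibits W."]
    for term, score in rank_terms(pair, background):
        print(term, round(score, 3))
```

In this toy setup a term scores higher when it appears in more of the pair's sentences and is rarer in the background collection; a real system would draw both sentence sets from MEDLINE.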
format Online
Article
Text
id pubmed-3729816
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3729816 2013-08-01 Automatic extraction of biomolecular interactions: an empirical approach Zhang, Lifeng Berleant, Daniel Ding, Jing Wurtele, Eve Syrkin BMC Bioinformatics Research Article BioMed Central 2013-07-24 /pmc/articles/PMC3729816/ /pubmed/23883165 http://dx.doi.org/10.1186/1471-2105-14-234 Text en Copyright © 2013 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Zhang, Lifeng
Berleant, Daniel
Ding, Jing
Wurtele, Eve Syrkin
Automatic extraction of biomolecular interactions: an empirical approach
title Automatic extraction of biomolecular interactions: an empirical approach
title_full Automatic extraction of biomolecular interactions: an empirical approach
title_fullStr Automatic extraction of biomolecular interactions: an empirical approach
title_full_unstemmed Automatic extraction of biomolecular interactions: an empirical approach
title_short Automatic extraction of biomolecular interactions: an empirical approach
title_sort automatic extraction of biomolecular interactions: an empirical approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3729816/
https://www.ncbi.nlm.nih.gov/pubmed/23883165
http://dx.doi.org/10.1186/1471-2105-14-234
work_keys_str_mv AT zhanglifeng automaticextractionofbiomolecularinteractionsanempiricalapproach
AT berleantdaniel automaticextractionofbiomolecularinteractionsanempiricalapproach
AT dingjing automaticextractionofbiomolecularinteractionsanempiricalapproach
AT wurteleevesyrkin automaticextractionofbiomolecularinteractionsanempiricalapproach