
How to link ontologies and protein–protein interactions to literature: text-mining approaches and the BioCreative experience

There is an increasing interest in developing ontologies and controlled vocabularies to improve the efficiency and consistency of manual literature curation, to enable more formal biocuration workflow results and ultimately to improve analysis of biological data. Two ontologies that have been succes...


Bibliographic Details
Main authors: Krallinger, Martin; Leitner, Florian; Vazquez, Miguel; Salgado, David; Marcelle, Christophe; Tyers, Mike; Valencia, Alfonso; Chatr-aryamontri, Andrew
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3309177/
https://www.ncbi.nlm.nih.gov/pubmed/22438567
http://dx.doi.org/10.1093/database/bas017
author Krallinger, Martin
Leitner, Florian
Vazquez, Miguel
Salgado, David
Marcelle, Christophe
Tyers, Mike
Valencia, Alfonso
Chatr-aryamontri, Andrew
author_sort Krallinger, Martin
collection PubMed
description There is an increasing interest in developing ontologies and controlled vocabularies to improve the efficiency and consistency of manual literature curation, to enable more formal biocuration workflow results and ultimately to improve analysis of biological data. Two ontologies that have been successfully used for this purpose are the Gene Ontology (GO) for annotating aspects of gene products and the Molecular Interaction ontology (PSI-MI) used by databases that archive protein–protein interactions. The examination of protein interactions has proven to be extremely promising for the understanding of cellular processes. Manual mapping of information from the biomedical literature to bio-ontology terms is one of the most challenging components in the curation pipeline. It requires that expert curators interpret the natural language descriptions contained in articles and infer their semantic equivalents in the ontology (controlled vocabulary). Since manual curation is a time-consuming process, there is strong motivation to implement text-mining techniques to automatically extract annotations from free text. A range of text mining strategies has been devised to assist in the automated extraction of biological data. These strategies either recognize technical terms used recurrently in the literature and propose them as candidates for inclusion in ontologies, or retrieve passages that serve as evidential support for annotating an ontology term, e.g. from the PSI-MI or GO controlled vocabularies. Here, we provide a general overview of current text-mining methods to automatically extract annotations of GO and PSI-MI ontology terms in the context of the BioCreative (Critical Assessment of Information Extraction Systems in Biology) challenge. Special emphasis is given to protein–protein interaction data and PSI-MI terms referring to interaction detection methods.
format Online
Article
Text
id pubmed-3309177
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3309177 2012-03-21 Database (Oxford) Original Article Oxford University Press /pmc/articles/PMC3309177/ /pubmed/22438567 http://dx.doi.org/10.1093/database/bas017 Text en
© The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title How to link ontologies and protein–protein interactions to literature: text-mining approaches and the BioCreative experience
topic Original Article