The eFIP system for text mining of protein interaction networks of phosphorylated proteins

Protein phosphorylation is a central regulatory mechanism in signal transduction involved in most biological processes. Phosphorylation of a protein may lead to activation or repression of its activity, alternative subcellular location and interaction with different binding partners. Extracting this...

Bibliographic Details
Main Authors: Tudor, Catalina O., Arighi, Cecilia N., Wang, Qinghua, Wu, Cathy H., Vijay-Shanker, K.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3514748/
https://www.ncbi.nlm.nih.gov/pubmed/23221174
http://dx.doi.org/10.1093/database/bas044
author Tudor, Catalina O.
Arighi, Cecilia N.
Wang, Qinghua
Wu, Cathy H.
Vijay-Shanker, K.
collection PubMed
description Protein phosphorylation is a central regulatory mechanism in signal transduction involved in most biological processes. Phosphorylation of a protein may lead to activation or repression of its activity, alternative subcellular location and interaction with different binding partners. Extracting this type of information from scientific literature is critical for connecting phosphorylated proteins with kinases and interaction partners, along with their functional outcomes, for knowledge discovery from phosphorylation protein networks. We have developed the Extracting Functional Impact of Phosphorylation (eFIP) text mining system, which combines several natural language processing techniques to find relevant abstracts mentioning phosphorylation of a given protein together with indications of protein–protein interactions (PPIs) and potential evidence for the impact of phosphorylation on the PPIs. eFIP integrates our previously developed tools, Extracting Gene Related ABstracts (eGRAB) for document retrieval and name disambiguation, Rule-based LIterature Mining System for Protein Phosphorylation (RLIMS-P) for extraction of phosphorylation information, a PPI module to detect PPIs involving phosphorylated proteins and an impact module for relation extraction. The text mining system has been integrated into the curation workflow of the Protein Ontology (PRO) to capture knowledge about phosphorylated proteins. The eFIP web interface accepts gene/protein names or identifiers, or PubMed identifiers as input, and displays results as a ranked list of abstracts with sentence evidence and a summary table, which can be exported to a spreadsheet upon result validation.
As a participant in the BioCreative-2012 Interactive Text Mining track, the performance of eFIP was evaluated on document retrieval (F-measures of 78–100%), sentence-level information extraction (F-measures of 70–80%) and document ranking (normalized discounted cumulative gain measures of 93–100% and mean average precision of 0.86). The utility and usability of the eFIP web interface were also evaluated during the BioCreative Workshop. The use of the eFIP interface provided a significant speed-up (∼2.5-fold) for time to completion of the curation task. Additionally, eFIP significantly simplifies the task of finding relevant articles on PPIs involving phosphorylated forms of a given protein. Database URL: http://proteininformationresource.org/pirwww/iprolink/eFIP.shtml
format Online
Article
Text
id pubmed-3514748
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3514748 2012-12-05 Database (Oxford) Original Articles Oxford University Press Text en © The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
title The eFIP system for text mining of protein interaction networks of phosphorylated proteins
topic Original Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3514748/
https://www.ncbi.nlm.nih.gov/pubmed/23221174
http://dx.doi.org/10.1093/database/bas044