
Simple and efficient machine learning frameworks for identifying protein-protein interaction relevant articles and experimental methods used to study the interactions

BACKGROUND: Protein-protein interaction (PPI) is an important biomedical phenomenon. Automatically detecting PPI-relevant articles and identifying methods that are used to study PPI are important text mining tasks. In this study, we have explored domain independent features to develop two open sourc...


Bibliographic Details
Main authors: Agarwal, Shashank, Liu, Feifan, Yu, Hong
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3269933/
https://www.ncbi.nlm.nih.gov/pubmed/22151701
http://dx.doi.org/10.1186/1471-2105-12-S8-S10
collection PubMed
description BACKGROUND: Protein-protein interaction (PPI) is an important biomedical phenomenon. Automatically detecting PPI-relevant articles and identifying methods that are used to study PPI are important text mining tasks. In this study, we have explored domain independent features to develop two open source machine learning frameworks. One performs binary classification to determine whether the given article is PPI relevant or not, named “Simple Classifier”, and the other one maps the PPI relevant articles with corresponding interaction method nodes in a standardized PSI-MI (Proteomics Standards Initiative-Molecular Interactions) ontology, named “OntoNorm”. RESULTS: We evaluated our system in the context of BioCreative challenge competition using the standardized data set. Our systems are amongst the top systems reported by the organizers, attaining 60.8% F1-score for identifying relevant documents, and 52.3% F1-score for mapping articles to interaction method ontology. CONCLUSION: Our results show that domain-independent machine learning frameworks can perform competitively well at the tasks of detecting PPI relevant articles and identifying the methods that were used to study the interaction in such articles. AVAILABILITY: Simple Classifier is available at http://sourceforge.net/p/simpleclassify/home/ and OntoNorm at http://sourceforge.net/p/ontonorm/home/.
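For readers unfamiliar with the metric quoted in the abstract, the F1-score is the harmonic mean of precision and recall. The sketch below is illustrative only: the function name and the counts are hypothetical and are not taken from the paper, from Simple Classifier or OntoNorm, or from the BioCreative data set.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when either is undefined."""
    predicted = true_positives + false_positives   # documents labelled relevant
    actual = true_positives + false_negatives      # documents that are relevant
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the formula.
score = f1_score(true_positives=6, false_positives=2, false_negatives=6)
print(round(score, 3))
```

With these made-up counts, precision is 0.75 and recall is 0.5, so a classifier that misses half of the relevant articles is penalized even though most of its positive predictions are correct.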
format Online
Article
Text
id pubmed-3269933
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3269933 2012-02-02 Agarwal, Shashank Liu, Feifan Yu, Hong BMC Bioinformatics Research BioMed Central 2011-10-03 /pmc/articles/PMC3269933/ /pubmed/22151701 http://dx.doi.org/10.1186/1471-2105-12-S8-S10 Text en Copyright ©2011 Agarwal et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Simple and efficient machine learning frameworks for identifying protein-protein interaction relevant articles and experimental methods used to study the interactions
topic Research