Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords

For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. The classifier is generated from training data represented using several features to decide whether a protein pair in each sentence has an interaction. Such a spec...

Bibliographic Details
Main Authors:	Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao
Format:	Online Article Text
Language:	English
Published:	Hindawi Publishing Corporation 2015
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4689882/ https://www.ncbi.nlm.nih.gov/pubmed/26783534 http://dx.doi.org/10.1155/2015/928531

author	Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao
author_facet	Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao
author_sort	Koyabu, Shun
collection	PubMed
description	For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. The classifier is generated from training data represented using several features to decide whether a protein pair in each sentence has an interaction. A specific keyword that is directly related to interaction, such as “bind” or “interact”, plays an important role in training classifiers. We call such a keyword, which affects the capability of the classifier, a dominant keyword. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is initially assumed, tentatively, to be a dominant keyword. Then classifiers are trained separately on the instances with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers, and the assumption is updated according to the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction.
format	Online Article Text
id	pubmed-4689882
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Hindawi Publishing Corporation
record_format	MEDLINE/PubMed
spelling	pubmed-46898822016-01-18 Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords Koyabu, Shun Phan, Thi Thanh Thuy Ohkawa, Takenao Biomed Res Int Research Article For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. The classifier is generated from training data represented using several features to decide whether a protein pair in each sentence has an interaction. Such a specific keyword that is directly related to interaction as “bind” or “interact” plays an important role for training classifiers. We call it a dominant keyword that affects the capability of the classifier. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that derives imbalanced classification results is tentatively assumed to be a dominant keyword initially. Then the classifiers are separately trained from the instance with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers. The assumption is updated by the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction. Hindawi Publishing Corporation 2015 2015-12-10 /pmc/articles/PMC4689882/ /pubmed/26783534 http://dx.doi.org/10.1155/2015/928531 Text en Copyright © 2015 Shun Koyabu et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Koyabu, Shun Phan, Thi Thanh Thuy Ohkawa, Takenao Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords
title	Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords
title_full	Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords
title_fullStr	Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords
title_full_unstemmed	Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords
title_short	Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords
title_sort	extraction of protein-protein interaction from scientific articles by predicting dominant keywords
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4689882/ https://www.ncbi.nlm.nih.gov/pubmed/26783534 http://dx.doi.org/10.1155/2015/928531
