Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature

Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches, such as linear kernels, tree kernels, graph kernels, and combinations of multiple kernels, have achieved promisi...

Bibliographic Details
Main Authors: Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5669485/
https://www.ncbi.nlm.nih.gov/pubmed/29099838
http://dx.doi.org/10.1371/journal.pone.0187379
author Murugesan, Gurusamy
Abdulkadhar, Sabenabanu
Natarajan, Jeyakumar
collection PubMed
description Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches, such as linear kernels, tree kernels, graph kernels, and combinations of multiple kernels, have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees carrying syntactic information along with distributional semantic vectors representing the semantic information of sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used to evaluate the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems.
format Online
Article
Text
id pubmed-5669485
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5669485 2017-11-17 Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature. Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar. PLoS One. Research Article. Public Library of Science 2017-11-03 /pmc/articles/PMC5669485/ /pubmed/29099838 http://dx.doi.org/10.1371/journal.pone.0187379 Text en © 2017 Murugesan et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5669485/
https://www.ncbi.nlm.nih.gov/pubmed/29099838
http://dx.doi.org/10.1371/journal.pone.0187379