Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature
Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches, such as linear kernels, tree kernels, graph kernels, and combinations of multiple kernels, have achieved promisi...
Main Authors: | Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5669485/ https://www.ncbi.nlm.nih.gov/pubmed/29099838 http://dx.doi.org/10.1371/journal.pone.0187379 |
_version_ | 1783275853419905024 |
---|---|
author | Murugesan, Gurusamy Abdulkadhar, Sabenabanu Natarajan, Jeyakumar |
author_facet | Murugesan, Gurusamy Abdulkadhar, Sabenabanu Natarajan, Jeyakumar |
author_sort | Murugesan, Gurusamy |
collection | PubMed |
description | Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches, such as linear kernels, tree kernels, graph kernels, and combinations of multiple kernels, have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree Kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees carrying syntactic information along with distributional semantic vectors representing the semantic information of sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used to evaluate the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems. |
format | Online Article Text |
id | pubmed-5669485 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5669485 2017-11-17 Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature Murugesan, Gurusamy Abdulkadhar, Sabenabanu Natarajan, Jeyakumar PLoS One Research Article Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches, such as linear kernels, tree kernels, graph kernels, and combinations of multiple kernels, have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree Kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees carrying syntactic information along with distributional semantic vectors representing the semantic information of sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used to evaluate the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems. Public Library of Science 2017-11-03 /pmc/articles/PMC5669485/ /pubmed/29099838 http://dx.doi.org/10.1371/journal.pone.0187379 Text en © 2017 Murugesan et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Murugesan, Gurusamy Abdulkadhar, Sabenabanu Natarajan, Jeyakumar Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature |
title | Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature |
title_sort | distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5669485/ https://www.ncbi.nlm.nih.gov/pubmed/29099838 http://dx.doi.org/10.1371/journal.pone.0187379 |