
On the efficacy of per-relation basis performance evaluation for PPI extraction and a high-precision rule-based approach

BACKGROUND: Most previous Protein Protein Interaction (PPI) studies evaluated their algorithms' performance based on "per-instance" precision and recall, in which the instances of an interaction relation were evaluated independently. However, we argue that this standard evaluation met...


Bibliographic Details
Main authors: Lee, Junkyu; Kim, Seongsoon; Lee, Sunwon; Lee, Kyubum; Kang, Jaewoo
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3618211/
https://www.ncbi.nlm.nih.gov/pubmed/23566263
http://dx.doi.org/10.1186/1472-6947-13-S1-S7
author Lee, Junkyu
Kim, Seongsoon
Lee, Sunwon
Lee, Kyubum
Kang, Jaewoo
collection PubMed
description BACKGROUND: Most previous Protein Protein Interaction (PPI) studies evaluated their algorithms' performance based on "per-instance" precision and recall, in which the instances of an interaction relation were evaluated independently. However, we argue that this standard evaluation method should be revisited. In a large corpus, the same relation can be described in various different forms and, in practice, correctly identifying not all but a small subset of them would often suffice to detect the given interaction. METHODS: In this regard, we propose a more pragmatic "per-relation" basis performance evaluation method instead of the conventional per-instance basis method. In the per-relation basis method, only a subset of a relation's instances needs to be correctly identified to make the relation positive. In this work, we also introduce a new high-precision rule-based PPI extraction algorithm. While virtually all current PPI extraction studies focus on improving F-score, aiming to balance the performance on both precision and recall, in many realistic scenarios involving large corpora, one can benefit more from a high-precision algorithm than a high-recall counterpart. RESULTS: We show that our algorithm not only achieves better per-relation performance than previous solutions but also serves as a good complement to the existing PPI extraction tools. Our algorithm improves the performance of the existing tools through simple pipelining. CONCLUSION: The significance of this research can be found in that this research brought new perspective to the performance evaluation of PPI extraction studies, which we believe is more important in practice than existing evaluation criteria. Given the new evaluation perspective, we also showed the importance of a high-precision extraction tool and validated the efficacy of our rule-based system as the high-precision tool candidate.
format Online
Article
Text
id pubmed-3618211
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3618211 2013-04-10 On the efficacy of per-relation basis performance evaluation for PPI extraction and a high-precision rule-based approach Lee, Junkyu Kim, Seongsoon Lee, Sunwon Lee, Kyubum Kang, Jaewoo BMC Med Inform Decis Mak Proceedings BioMed Central 2013-04-05 /pmc/articles/PMC3618211/ /pubmed/23566263 http://dx.doi.org/10.1186/1472-6947-13-S1-S7 Text en Copyright © 2013 Lee et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title On the efficacy of per-relation basis performance evaluation for PPI extraction and a high-precision rule-based approach
topic Proceedings