
The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

BACKGROUND: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motiv...


Bibliographic Details
Main authors: Krallinger, Martin, Vazquez, Miguel, Leitner, Florian, Salgado, David, Chatr-aryamontri, Andrew, Winter, Andrew, Perfetto, Livia, Briganti, Leonardo, Licata, Luana, Iannuccelli, Marta, Castagnoli, Luisa, Cesareni, Gianni, Tyers, Mike, Schneider, Gerold, Rinaldi, Fabio, Leaman, Robert, Gonzalez, Graciela, Matos, Sergio, Kim, Sun, Wilbur, W John, Rocha, Luis, Shatkay, Hagit, Tendulkar, Ashish V, Agarwal, Shashank, Liu, Feifan, Wang, Xinglong, Rak, Rafal, Noto, Keith, Elkan, Charles, Lu, Zhiyong, Dogan, Rezarta Islamaj, Fontaine, Jean-Fred, Andrade-Navarro, Miguel A, Valencia, Alfonso
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3269938/
https://www.ncbi.nlm.nih.gov/pubmed/22151929
http://dx.doi.org/10.1186/1471-2105-12-S8-S3
author Krallinger, Martin
Vazquez, Miguel
Leitner, Florian
Salgado, David
Chatr-aryamontri, Andrew
Winter, Andrew
Perfetto, Livia
Briganti, Leonardo
Licata, Luana
Iannuccelli, Marta
Castagnoli, Luisa
Cesareni, Gianni
Tyers, Mike
Schneider, Gerold
Rinaldi, Fabio
Leaman, Robert
Gonzalez, Graciela
Matos, Sergio
Kim, Sun
Wilbur, W John
Rocha, Luis
Shatkay, Hagit
Tendulkar, Ashish V
Agarwal, Shashank
Liu, Feifan
Wang, Xinglong
Rak, Rafal
Noto, Keith
Elkan, Charles
Lu, Zhiyong
Dogan, Rezarta Islamaj
Fontaine, Jean-Fred
Andrade-Navarro, Miguel A
Valencia, Alfonso
collection PubMed
description BACKGROUND: Determining the usefulness of biomedical text mining systems requires realistic task definitions and data selection criteria without artificial constraints, and measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations and tried to address how the end user would oversee the generated output, for instance by providing ranked results and textual evidence for human interpretation, or by measuring the time saved by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose the BCIII-ACT corpus was provided, which includes training, development and test sets of over 12,000 PPI-relevant and non-relevant PubMed abstracts, labeled manually by domain experts, together with the recorded human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and the interaction detection method ontology concepts that had been applied to detect the PPIs reported in them.
RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in the ACT and 8 in the IMT), and a total of 62 people were involved either as participants or in preparing data sets and evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. Of the 52 runs submitted for the ACT, the highest Matthews correlation coefficient (MCC) score measured was 0.55 at an accuracy of 89%, and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods; some also used lexical resources such as MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against annotations generated manually by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, and the best MCC score was 0.55. For competitive systems with an acceptable recall (above 35%), the macro-averaged precision ranged between 50% and 80%, with a maximum F-score of 55%.
CONCLUSIONS: The results of the ACT of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time required with unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
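The abstract reports MCC alongside accuracy because accuracy alone can look high on an unbalanced collection. The following is a minimal sketch of how the two measures relate, computed from confusion-matrix counts using the standard MCC definition. The function names and all counts are hypothetical illustrations, not data or code from the BioCreative III evaluation.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all documents that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (for example, when a system
    assigns every document to the same class), a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical unbalanced collection: 150 relevant and 850 non-relevant abstracts.
# These counts are made up for illustration and are not taken from the paper.
system = dict(tp=90, fp=50, tn=800, fn=60)

# Trivial baseline that labels every abstract as non-relevant.
baseline = dict(tp=0, fp=0, tn=850, fn=150)

for name, counts in (("system", system), ("baseline", baseline)):
    print(name, round(accuracy(**counts), 3), round(mcc(**counts), 3))
```

In this made-up setting the all-negative baseline still reaches a high accuracy while its MCC is zero, which illustrates why a chance-corrected measure is informative when relevant articles are a small minority.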
format Online
Article
Text
id pubmed-3269938
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3269938 2012-02-02 BMC Bioinformatics Research BioMed Central 2011-10-03 /pmc/articles/PMC3269938/ /pubmed/22151929 http://dx.doi.org/10.1186/1471-2105-12-S8-S3 Text en Copyright ©2011 Krallinger et al. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3269938/
https://www.ncbi.nlm.nih.gov/pubmed/22151929
http://dx.doi.org/10.1186/1471-2105-12-S8-S3