The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text
BACKGROUND: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motiv...
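Main authors: | Krallinger, Martin; Vazquez, Miguel; Leitner, Florian; Salgado, David; Chatr-aryamontri, Andrew; Winter, Andrew; Perfetto, Livia; Briganti, Leonardo; Licata, Luana; Iannuccelli, Marta; Castagnoli, Luisa; Cesareni, Gianni; Tyers, Mike; Schneider, Gerold; Rinaldi, Fabio; Leaman, Robert; Gonzalez, Graciela; Matos, Sergio; Kim, Sun; Wilbur, W John; Rocha, Luis; Shatkay, Hagit; Tendulkar, Ashish V; Agarwal, Shashank; Liu, Feifan; Wang, Xinglong; Rak, Rafal; Noto, Keith; Elkan, Charles; Lu, Zhiyong; Dogan, Rezarta Islamaj; Fontaine, Jean-Fred; Andrade-Navarro, Miguel A; Valencia, Alfonso |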
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3269938/ https://www.ncbi.nlm.nih.gov/pubmed/22151929 http://dx.doi.org/10.1186/1471-2105-12-S8-S3 |
_version_ | 1782222523286421504 |
---|---|
author | Krallinger, Martin Vazquez, Miguel Leitner, Florian Salgado, David Chatr-aryamontri, Andrew Winter, Andrew Perfetto, Livia Briganti, Leonardo Licata, Luana Iannuccelli, Marta Castagnoli, Luisa Cesareni, Gianni Tyers, Mike Schneider, Gerold Rinaldi, Fabio Leaman, Robert Gonzalez, Graciela Matos, Sergio Kim, Sun Wilbur, W John Rocha, Luis Shatkay, Hagit Tendulkar, Ashish V Agarwal, Shashank Liu, Feifan Wang, Xinglong Rak, Rafal Noto, Keith Elkan, Charles Lu, Zhiyong Dogan, Rezarta Islamaj Fontaine, Jean-Fred Andrade-Navarro, Miguel A Valencia, Alfonso |
author_facet | Krallinger, Martin Vazquez, Miguel Leitner, Florian Salgado, David Chatr-aryamontri, Andrew Winter, Andrew Perfetto, Livia Briganti, Leonardo Licata, Luana Iannuccelli, Marta Castagnoli, Luisa Cesareni, Gianni Tyers, Mike Schneider, Gerold Rinaldi, Fabio Leaman, Robert Gonzalez, Graciela Matos, Sergio Kim, Sun Wilbur, W John Rocha, Luis Shatkay, Hagit Tendulkar, Ashish V Agarwal, Shashank Liu, Feifan Wang, Xinglong Rak, Rafal Noto, Keith Elkan, Charles Lu, Zhiyong Dogan, Rezarta Islamaj Fontaine, Jean-Fred Andrade-Navarro, Miguel A Valencia, Alfonso |
author_sort | Krallinger, Martin |
collection | PubMed |
description | BACKGROUND: Determining the usefulness of biomedical text mining systems requires realistic task definitions and data selection criteria without artificial constraints, as well as measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations and tried to address aspects including how the end user would oversee the generated output, for instance by providing ranked results and textual evidence for human interpretation, or by measuring the time saved by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose the BCIII-ACT corpus was provided, which includes training, development and test sets totaling over 12,000 PPI-relevant and non-relevant PubMed abstracts, labeled manually by domain experts, together with the recorded human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and the interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in the ACT and 8 in the IMT), and a total of 62 persons were involved either as participants or in preparing data sets and evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. Among the 52 runs submitted for the ACT, the highest Matthews correlation coefficient (MCC) score measured was 0.55 at an accuracy of 89%, and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods; some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations produced by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, and the best MCC score was 0.55. For competitive systems with an acceptable recall (above 35%), the macro-averaged precision ranged between 50% and 80%, with a maximum F-score of 55%. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time needed with unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows. |
format | Online Article Text |
id | pubmed-3269938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-32699382012-02-02 The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text Krallinger, Martin Vazquez, Miguel Leitner, Florian Salgado, David Chatr-aryamontri, Andrew Winter, Andrew Perfetto, Livia Briganti, Leonardo Licata, Luana Iannuccelli, Marta Castagnoli, Luisa Cesareni, Gianni Tyers, Mike Schneider, Gerold Rinaldi, Fabio Leaman, Robert Gonzalez, Graciela Matos, Sergio Kim, Sun Wilbur, W John Rocha, Luis Shatkay, Hagit Tendulkar, Ashish V Agarwal, Shashank Liu, Feifan Wang, Xinglong Rak, Rafal Noto, Keith Elkan, Charles Lu, Zhiyong Dogan, Rezarta Islamaj Fontaine, Jean-Fred Andrade-Navarro, Miguel A Valencia, Alfonso BMC Bioinformatics Research BACKGROUND: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. Therefore the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. 
The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthew's Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, the best MCC score 0.55. In case of competitive systems with an acceptable recall (above 35%) the macro-averaged precision ranged between 50% and 80%, with a maximum F-Score of 55%. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. 
This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows. BioMed Central 2011-10-03 /pmc/articles/PMC3269938/ /pubmed/22151929 http://dx.doi.org/10.1186/1471-2105-12-S8-S3 Text en Copyright ©2011 Krallinger et al. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Krallinger, Martin Vazquez, Miguel Leitner, Florian Salgado, David Chatr-aryamontri, Andrew Winter, Andrew Perfetto, Livia Briganti, Leonardo Licata, Luana Iannuccelli, Marta Castagnoli, Luisa Cesareni, Gianni Tyers, Mike Schneider, Gerold Rinaldi, Fabio Leaman, Robert Gonzalez, Graciela Matos, Sergio Kim, Sun Wilbur, W John Rocha, Luis Shatkay, Hagit Tendulkar, Ashish V Agarwal, Shashank Liu, Feifan Wang, Xinglong Rak, Rafal Noto, Keith Elkan, Charles Lu, Zhiyong Dogan, Rezarta Islamaj Fontaine, Jean-Fred Andrade-Navarro, Miguel A Valencia, Alfonso The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text |
title | The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text |
title_full | The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text |
title_fullStr | The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text |
title_full_unstemmed | The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text |
title_short | The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text |
title_sort | protein-protein interaction tasks of biocreative iii: classification/ranking of articles and linking bio-ontology concepts to full text |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3269938/ https://www.ncbi.nlm.nih.gov/pubmed/22151929 http://dx.doi.org/10.1186/1471-2105-12-S8-S3 |
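
The description above reports accuracy, the Matthews correlation coefficient (MCC) and the F-score side by side. The following is a minimal sketch of how these three metrics are computed from confusion-matrix counts and why they can diverge on an unbalanced collection. The function names and the counts used are hypothetical illustrations; they are not taken from the BioCreative III evaluation or its data.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all items that were labeled correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def f_score(tp, fp, fn):
    """Balanced F-score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for an unbalanced collection (10% relevant articles).
# These numbers are invented for illustration only.
tp, fp, tn, fn = 50, 50, 850, 50

print(f"accuracy = {accuracy(tp, fp, tn, fn):.3f}")  # 0.900
print(f"MCC      = {mcc(tp, fp, tn, fn):.3f}")       # 0.444
print(f"F-score  = {f_score(tp, fp, fn):.3f}")       # 0.500
```

With these invented counts, accuracy stays high because the many true negatives dominate, while MCC and the F-score are much lower. This is the general reason evaluations on unbalanced article collections report more than accuracy.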