An overview of the BioCreative 2012 Workshop Track III: interactive text mining task
In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to...
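Main Authors: | Arighi, Cecilia N.; Carterette, Ben; Cohen, K. Bretonnel; Krallinger, Martin; Wilbur, W. John; Fey, Petra; Dodson, Robert; Cooper, Laurel; Van Slyke, Ceri E.; Dahdul, Wasila; Mabee, Paula; Li, Donghui; Harris, Bethany; Gillespie, Marc; Jimenez, Silvia; Roberts, Phoebe; Matthews, Lisa; Becker, Kevin; Drabkin, Harold; Bello, Susan; Licata, Luana; Chatr-aryamontri, Andrew; Schaeffer, Mary L.; Park, Julie; Haendel, Melissa; Van Auken, Kimberly; Li, Yuling; Chan, Juancarlos; Muller, Hans-Michael; Cui, Hong; Balhoff, James P.; Chi-Yang Wu, Johnny; Lu, Zhiyong; Wei, Chih-Hsuan; Tudor, Catalina O.; Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar; Cejuela, Juan Miguel; Dubey, Pratibha; Wu, Cathy |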
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3625048/ https://www.ncbi.nlm.nih.gov/pubmed/23327936 http://dx.doi.org/10.1093/database/bas056 |
_version_ | 1782266067119243264 |
---|---|
author | Arighi, Cecilia N. Carterette, Ben Cohen, K. Bretonnel Krallinger, Martin Wilbur, W. John Fey, Petra Dodson, Robert Cooper, Laurel Van Slyke, Ceri E. Dahdul, Wasila Mabee, Paula Li, Donghui Harris, Bethany Gillespie, Marc Jimenez, Silvia Roberts, Phoebe Matthews, Lisa Becker, Kevin Drabkin, Harold Bello, Susan Licata, Luana Chatr-aryamontri, Andrew Schaeffer, Mary L. Park, Julie Haendel, Melissa Van Auken, Kimberly Li, Yuling Chan, Juancarlos Muller, Hans-Michael Cui, Hong Balhoff, James P. Chi-Yang Wu, Johnny Lu, Zhiyong Wei, Chih-Hsuan Tudor, Catalina O. Raja, Kalpana Subramani, Suresh Natarajan, Jeyakumar Cejuela, Juan Miguel Dubey, Pratibha Wu, Cathy |
author_sort | Arighi, Cecilia N. |
collection | PubMed |
description | In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (∼1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems. The analysis of this survey highlights how important task completion is to the biocurators’ overall experience of a system, regardless of the system’s high score on design, learnability and usability. In addition, strategies to refine the annotation guidelines and systems documentation, to adapt the tools to the needs and query types the end user might have and to evaluate performance in terms of efficiency, user interface, result export and traditional evaluation metrics have been analyzed during this task. This analysis will help to plan for a more intense study in BioCreative IV. |
format | Online Article Text |
id | pubmed-3625048 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3625048 2013-04-15 An overview of the BioCreative 2012 Workshop Track III: interactive text mining task Database (Oxford) Original Article Oxford University Press 2013-01-16 /pmc/articles/PMC3625048/ /pubmed/23327936 http://dx.doi.org/10.1093/database/bas056 Text en Published by Oxford University Press on behalf of US Government 2013. |
title | An overview of the BioCreative 2012 Workshop Track III: interactive text mining task |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3625048/ https://www.ncbi.nlm.nih.gov/pubmed/23327936 http://dx.doi.org/10.1093/database/bas056 |