Text-mining-assisted biocuration workflows in Argo

Biocuration activities have been broadly categorized into the selection of relevant documents, the annotation of biological concepts of interest and identification of interactions between the concepts. Text mining has been shown to have a potential to significantly reduce the effort of biocurators i...

Bibliographic Details
Main Authors: Rak, Rafal; Batista-Navarro, Riza Theresa; Rowley, Andrew; Carter, Jacob; Ananiadou, Sophia
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103424/
https://www.ncbi.nlm.nih.gov/pubmed/25037308
http://dx.doi.org/10.1093/database/bau070
author Rak, Rafal
Batista-Navarro, Riza Theresa
Rowley, Andrew
Carter, Jacob
Ananiadou, Sophia
author_facet Rak, Rafal
Batista-Navarro, Riza Theresa
Rowley, Andrew
Carter, Jacob
Ananiadou, Sophia
author_sort Rak, Rafal
collection PubMed
description Biocuration activities have been broadly categorized into the selection of relevant documents, the annotation of biological concepts of interest and identification of interactions between the concepts. Text mining has been shown to have a potential to significantly reduce the effort of biocurators in all the three activities, and various semi-automatic methodologies have been integrated into curation pipelines to support them. We investigate the suitability of Argo, a workbench for building text-mining solutions with the use of a rich graphical user interface, for the process of biocuration. Central to Argo are customizable workflows that users compose by arranging available elementary analytics to form task-specific processing units. A built-in manual annotation editor is the single most used biocuration tool of the workbench, as it allows users to create annotations directly in text, as well as modify or delete annotations created by automatic processing components. Apart from syntactic and semantic analytics, the ever-growing library of components includes several data readers and consumers that support well-established as well as emerging data interchange formats such as XMI, RDF and BioC, which facilitate the interoperability of Argo with other platforms or resources. To validate the suitability of Argo for curation activities, we participated in the BioCreative IV challenge whose purpose was to evaluate Web-based systems addressing user-defined biocuration tasks. Argo proved to have the edge over other systems in terms of flexibility of defining biocuration tasks. As expected, the versatility of the workbench inevitably lengthened the time the curators spent on learning the system before taking on the task, which may have affected the usability of Argo. The participation in the challenge gave us an opportunity to gather valuable feedback and identify areas of improvement, some of which have already been introduced. 
Database URL: http://argo.nactem.ac.uk
format Online
Article
Text
id pubmed-4103424
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-41034242014-07-21 Text-mining-assisted biocuration workflows in Argo Rak, Rafal Batista-Navarro, Riza Theresa Rowley, Andrew Carter, Jacob Ananiadou, Sophia Database (Oxford) Original Article Biocuration activities have been broadly categorized into the selection of relevant documents, the annotation of biological concepts of interest and identification of interactions between the concepts. Text mining has been shown to have a potential to significantly reduce the effort of biocurators in all the three activities, and various semi-automatic methodologies have been integrated into curation pipelines to support them. We investigate the suitability of Argo, a workbench for building text-mining solutions with the use of a rich graphical user interface, for the process of biocuration. Central to Argo are customizable workflows that users compose by arranging available elementary analytics to form task-specific processing units. A built-in manual annotation editor is the single most used biocuration tool of the workbench, as it allows users to create annotations directly in text, as well as modify or delete annotations created by automatic processing components. Apart from syntactic and semantic analytics, the ever-growing library of components includes several data readers and consumers that support well-established as well as emerging data interchange formats such as XMI, RDF and BioC, which facilitate the interoperability of Argo with other platforms or resources. To validate the suitability of Argo for curation activities, we participated in the BioCreative IV challenge whose purpose was to evaluate Web-based systems addressing user-defined biocuration tasks. Argo proved to have the edge over other systems in terms of flexibility of defining biocuration tasks. As expected, the versatility of the workbench inevitably lengthened the time the curators spent on learning the system before taking on the task, which may have affected the usability of Argo. 
The participation in the challenge gave us an opportunity to gather valuable feedback and identify areas of improvement, some of which have already been introduced. Database URL: http://argo.nactem.ac.uk Oxford University Press 2014-07-18 /pmc/articles/PMC4103424/ /pubmed/25037308 http://dx.doi.org/10.1093/database/bau070 Text en © The Author(s) 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Rak, Rafal
Batista-Navarro, Riza Theresa
Rowley, Andrew
Carter, Jacob
Ananiadou, Sophia
Text-mining-assisted biocuration workflows in Argo
title Text-mining-assisted biocuration workflows in Argo
title_full Text-mining-assisted biocuration workflows in Argo
title_fullStr Text-mining-assisted biocuration workflows in Argo
title_full_unstemmed Text-mining-assisted biocuration workflows in Argo
title_short Text-mining-assisted biocuration workflows in Argo
title_sort text-mining-assisted biocuration workflows in argo
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103424/
https://www.ncbi.nlm.nih.gov/pubmed/25037308
http://dx.doi.org/10.1093/database/bau070
work_keys_str_mv AT rakrafal textminingassistedbiocurationworkflowsinargo
AT batistanavarrorizatheresa textminingassistedbiocurationworkflowsinargo
AT rowleyandrew textminingassistedbiocurationworkflowsinargo
AT carterjacob textminingassistedbiocurationworkflowsinargo
AT ananiadousophia textminingassistedbiocurationworkflowsinargo