
Text mining meets community curation: a newly designed curation platform to improve author experience and participation at WormBase

Biological knowledgebases rely on expert biocuration of the research literature to maintain up-to-date collections of data organized in machine-readable form. To enter information into knowledgebases, curators need to follow three steps: (i) identify papers containing relevant data, a process called...

Bibliographic Details
Main authors:	Arnaboldi, Valerio; Raciti, Daniela; Van Auken, Kimberly; Chan, Juancarlos N; Müller, Hans-Michael; Sternberg, Paul W
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2020
Subjects:	Original Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7078066/ https://www.ncbi.nlm.nih.gov/pubmed/32185395 http://dx.doi.org/10.1093/database/baaa006

author	Arnaboldi, Valerio; Raciti, Daniela; Van Auken, Kimberly; Chan, Juancarlos N; Müller, Hans-Michael; Sternberg, Paul W
author_sort	Arnaboldi, Valerio
collection	PubMed
description	Biological knowledgebases rely on expert biocuration of the research literature to maintain up-to-date collections of data organized in machine-readable form. To enter information into knowledgebases, curators need to follow three steps: (i) identify papers containing relevant data, a process called triaging; (ii) recognize named entities; and (iii) extract and curate data in accordance with the underlying data models. WormBase (WB), the authoritative repository for research data on Caenorhabditis elegans and other nematodes, uses text mining (TM) to semi-automate its curation pipeline. In addition, WB engages its community, via an Author First Pass (AFP) system, to help recognize entities and classify data types in their recently published papers. In this paper, we present a new WB AFP system that combines TM and AFP into a single application to enhance community curation. The system employs string-searching algorithms and statistical methods (e.g. support vector machines (SVMs)) to extract biological entities and classify data types, and it presents the results to authors in a web form where they validate the extracted information, rather than enter it de novo as the previous form required. With this new system, we lessen the burden for authors, while at the same time receiving valuable feedback on the performance of our TM tools. The new user interface also links out to specific structured data submission forms, e.g. for phenotype or expression pattern data, giving the authors the opportunity to contribute a more detailed curation that can be incorporated into WB with minimal curator review. Our approach is generalizable and could be applied to additional knowledgebases that would like to engage their user community in assisting with the curation. In the five months following the launch of the new system, the response rate has been comparable with that of the previous AFP version, but the quality and quantity of the data received have greatly improved.
format	Online Article Text
id	pubmed-7078066
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7078066 2020-03-23 Database (Oxford) Original Article Oxford University Press 2020-03-17 /pmc/articles/PMC7078066/ /pubmed/32185395 http://dx.doi.org/10.1093/database/baaa006 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Text mining meets community curation: a newly designed curation platform to improve author experience and participation at WormBase
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7078066/ https://www.ncbi.nlm.nih.gov/pubmed/32185395 http://dx.doi.org/10.1093/database/baaa006
