
The GNAT library for local and remote gene mention normalization

Summary: Identifying mentions of named entities, such as genes or diseases, and normalizing them to database identifiers have become an important step in many text and data mining pipelines. Despite this need, very few entity normalization systems are publicly available as source code or web service...


Bibliographic Details
Main authors: Hakenberg, Jörg; Gerner, Martin; Haeussler, Maximilian; Solt, Illés; Plake, Conrad; Schroeder, Michael; Gonzalez, Graciela; Nenadic, Goran; Bergman, Casey M.
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects: Applications Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3179658/
https://www.ncbi.nlm.nih.gov/pubmed/21813477
http://dx.doi.org/10.1093/bioinformatics/btr455
collection PubMed
description Summary: Identifying mentions of named entities, such as genes or diseases, and normalizing them to database identifiers have become an important step in many text and data mining pipelines. Despite this need, very few entity normalization systems are publicly available as source code or web services for biomedical text mining. Here we present the Gnat Java library for text retrieval, named entity recognition, and normalization of gene and protein mentions in biomedical text. The library can be used as a component to be integrated with other text-mining systems, as a framework to add user-specific extensions, and as an efficient stand-alone application for the identification of gene and protein names for data analysis. On the BioCreative III test data, the current version of Gnat achieves a Tap-20 score of 0.1987. Availability: The library and web services are implemented in Java and the sources are available from http://gnat.sourceforge.net. Contact: jorg.hakenberg@roche.com
format Online
Article
Text
id pubmed-3179658
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3179658 2011-09-26
Bioinformatics, Applications Note
Oxford University Press 2011-10-01 2011-08-03
Text en © The Author(s) 2011. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The GNAT library for local and remote gene mention normalization
topic Applications Note