
Recognition of protein/gene names from text using an ensemble of classifiers

This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modul...


Bibliographic Details
Main Authors: Zhou, GuoDong, Shen, Dan, Zhang, Jie, Su, Jian, Tan, SoonHeng
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869021/
https://www.ncbi.nlm.nih.gov/pubmed/15960841
http://dx.doi.org/10.1186/1471-2105-6-S1-S7
author Zhou, GuoDong
Shen, Dan
Zhang, Jie
Su, Jian
Tan, SoonHeng
author_sort Zhou, GuoDong
collection PubMed
description This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules, including an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module, into the system to further improve the performance. Evaluation shows that our system achieves the best performance from among 10 systems with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).
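The description mentions combining three classifiers by simple majority voting. The record gives no implementation details, so the following is only a minimal sketch of that general idea, assuming each classifier emits one label per token. The function name `majority_vote`, the B/I/O label scheme, the sample predictions, and the tie-breaking rule (fall back to the first classifier when no label has a strict majority) are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token label sequences from several classifiers."""
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all classifiers must label the same tokens")

    combined = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        # Assumption: without a strict majority, trust the first classifier.
        combined.append(label if count > len(labels) / 2 else labels[0])
    return combined


# Hypothetical outputs of one SVM and two HMM taggers on four tokens.
svm = ["B", "I", "O", "O"]
hmm_a = ["B", "O", "O", "B"]
hmm_b = ["B", "I", "O", "I"]
print(majority_vote([svm, hmm_a, hmm_b]))
```

With three voters and more than two labels, a three-way disagreement is possible, which is why the sketch needs an explicit fallback; how the authors handle that case is not stated in this record.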
format Text
id pubmed-1869021
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1869021 2007-05-18 Recognition of protein/gene names from text using an ensemble of classifiers Zhou, GuoDong Shen, Dan Zhang, Jie Su, Jian Tan, SoonHeng BMC Bioinformatics Report This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules, including an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module, into the system to further improve the performance. Evaluation shows that our system achieves the best performance from among 10 systems with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A). BioMed Central 2005-05-24 /pmc/articles/PMC1869021/ /pubmed/15960841 http://dx.doi.org/10.1186/1471-2105-6-S1-S7 Text en Copyright © 2005 Zhou et al; licensee BioMed Central Ltd http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Recognition of protein/gene names from text using an ensemble of classifiers
topic Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869021/
https://www.ncbi.nlm.nih.gov/pubmed/15960841
http://dx.doi.org/10.1186/1471-2105-6-S1-S7