Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts

BACKGROUND: Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedic...

Bibliographic Details
Main authors: Chun, Hong-Woo, Tsuruoka, Yoshimasa, Kim, Jin-Dong, Shiba, Rie, Nagata, Naoki, Hishiki, Teruyoshi, Tsujii, Jun'ichi
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1764448/
https://www.ncbi.nlm.nih.gov/pubmed/17134477
http://dx.doi.org/10.1186/1471-2105-7-S3-S4
_version_ 1782131615799967744
author Chun, Hong-Woo
Tsuruoka, Yoshimasa
Kim, Jin-Dong
Shiba, Rie
Nagata, Naoki
Hishiki, Teruyoshi
Tsujii, Jun'ichi
collection PubMed
description BACKGROUND: Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. METHODS: We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. RESULTS: Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. CONCLUSION: A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques.
format Text
id pubmed-1764448
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1764448 2007-01-09 Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts Chun, Hong-Woo Tsuruoka, Yoshimasa Kim, Jin-Dong Shiba, Rie Nagata, Naoki Hishiki, Teruyoshi Tsujii, Jun'ichi BMC Bioinformatics Proceedings BACKGROUND: Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. METHODS: We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. RESULTS: Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. CONCLUSION: A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques. BioMed Central 2006-11-24 /pmc/articles/PMC1764448/ /pubmed/17134477 http://dx.doi.org/10.1186/1471-2105-7-S3-S4 Text en Copyright © 2006 Chun et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1764448/
https://www.ncbi.nlm.nih.gov/pubmed/17134477
http://dx.doi.org/10.1186/1471-2105-7-S3-S4
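
The abstract in this record describes a maximum entropy-based relation recognizer evaluated by precision. The following is a minimal sketch of that general technique, not the authors' implementation: a maximum entropy (multinomial logistic regression) classifier trained by plain gradient ascent, plus a precision function. The feature names, the toy training pairs, and the function names are all hypothetical and chosen only for illustration; the record does not describe the features the authors used.

```python
import math

def predict_proba(weights, classes, feats):
    # Maximum entropy model form: softmax over per-class sums of feature weights.
    scores = {c: sum(weights.get((c, f), 0.0) for f in feats) for c in classes}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train_maxent(samples, labels, epochs=200, lr=0.5):
    # Batch gradient ascent on the log-likelihood (no regularization, for brevity).
    classes = sorted(set(labels))
    weights = {}
    for _ in range(epochs):
        grad = {}
        for feats, gold in zip(samples, labels):
            probs = predict_proba(weights, classes, feats)
            for c in classes:
                for f in feats:
                    # Observed minus expected count of feature f under class c.
                    delta = (1.0 if c == gold else 0.0) - probs[c]
                    grad[(c, f)] = grad.get((c, f), 0.0) + delta
        for key, g in grad.items():
            weights[key] = weights.get(key, 0.0) + lr * g / len(samples)
    return weights, classes

def classify(weights, classes, feats):
    probs = predict_proba(weights, classes, feats)
    return max(classes, key=lambda c: probs[c])

def precision(predicted, gold, positive):
    # Precision = true positives / (true positives + false positives).
    tp = sum(1 for p, g in zip(predicted, gold) if p == positive and g == positive)
    fp = sum(1 for p, g in zip(predicted, gold) if p == positive and g != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical features for candidate (gene, prostate cancer) pairs.
TRAIN_X = [
    ["verb=overexpressed", "same_sentence"],
    ["verb=mutated", "same_sentence"],
    ["verb=compared", "different_sentence"],
    ["verb=measured", "different_sentence"],
]
TRAIN_Y = ["relation", "relation", "no_relation", "no_relation"]

WEIGHTS, CLASSES = train_maxent(TRAIN_X, TRAIN_Y)
PREDICTED = [classify(WEIGHTS, CLASSES, x) for x in TRAIN_X]
print(PREDICTED, precision(PREDICTED, TRAIN_Y, "relation"))
```

In this sketch a candidate pair is reduced to a bag of binary features, and the same classifier could in principle be trained once per topic to mirror the topic-classified setup the abstract mentions. A practical system would add regularization and evaluate on held-out data rather than on the training pairs.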