Learning to rank-based gene summary extraction

BACKGROUND: In recent years, the biomedical literature has grown rapidly. These articles provide a large amount of information about proteins, genes and their interactions. Reading such a large body of literature to gain knowledge about a gene is tedious for researchers. As a result,...

Bibliographic Details
Main Authors: Shang, Yue, Hao, Huihui, Wu, Jiajin, Lin, Hongfei
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243090/
https://www.ncbi.nlm.nih.gov/pubmed/25474678
http://dx.doi.org/10.1186/1471-2105-15-S12-S10
author Shang, Yue
Hao, Huihui
Wu, Jiajin
Lin, Hongfei
collection PubMed
description BACKGROUND: In recent years, the biomedical literature has grown rapidly. These articles provide a large amount of information about proteins, genes and their interactions. Reading such a large body of literature to gain knowledge about a gene is tedious for researchers. It is therefore important for biomedical researchers to gain a quick understanding of a query concept by integrating its relevant resources. METHODS: In gene summary generation, we treat automatic summarization as a ranking problem and apply learning to rank to solve it. We use three features as the basis for sentence selection: gene ontology relevance, topic relevance and TextRank. We then obtain the feature weight vector with a learning to rank algorithm, predict the scores of candidate summary sentences, and select the top-scoring sentences to generate the summary. RESULTS: ROUGE (a toolkit for automatic evaluation of summaries) was used to evaluate the summarization results, and the experiments showed that our method outperforms the baseline techniques. CONCLUSIONS: According to the experimental results, combining the three features can improve summarization performance. Applying learning to rank also makes it easier to add further features for measuring the significance of sentences.
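The scoring step described under METHODS can be illustrated with a minimal sketch. It assumes a linear scoring model, in which each candidate sentence has three precomputed feature values (gene ontology relevance, topic relevance, TextRank) and a weight vector has already been learned. The record does not say which learning to rank algorithm the authors used, so the learning step is not shown, and the function names, weights and feature values below are hypothetical placeholders rather than the authors' implementation.

```python
from typing import List, Sequence


def score_sentence(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear score: dot product of one sentence's features and the weights."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights))


def select_summary(
    sentences: Sequence[str],
    feature_rows: Sequence[Sequence[float]],
    weights: Sequence[float],
    top_k: int = 3,
) -> List[str]:
    """Return the top_k highest-scoring sentences, kept in document order."""
    if len(sentences) != len(feature_rows):
        raise ValueError("one feature row is required per sentence")
    # Pair each score with the sentence index so ties break by position.
    scored = [(score_sentence(row, weights), i) for i, row in enumerate(feature_rows)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    chosen = sorted(index for _, index in ranked[:top_k])
    return [sentences[index] for index in chosen]


# Hypothetical feature order: [GO relevance, topic relevance, TextRank].
example_sentences = ["s1", "s2", "s3", "s4"]
example_features = [
    [0.9, 0.8, 0.7],
    [0.1, 0.2, 0.1],
    [0.5, 0.9, 0.6],
    [0.0, 0.1, 0.0],
]
example_weights = [0.5, 0.3, 0.2]  # assumed, not learned from real data

print(select_summary(example_sentences, example_features, example_weights, top_k=2))
```

Returning the selected sentences in document order is a design choice of this sketch; the abstract only states that the top sentences are used to generate the summary.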
format Online
Article
Text
id pubmed-4243090
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4243090 2014-11-26 Learning to rank-based gene summary extraction Shang, Yue Hao, Huihui Wu, Jiajin Lin, Hongfei BMC Bioinformatics Research BioMed Central 2014-11-06 /pmc/articles/PMC4243090/ /pubmed/25474678 http://dx.doi.org/10.1186/1471-2105-15-S12-S10 Text en Copyright © 2014 Shang et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Learning to rank-based gene summary extraction
topic Research