Figure-Associated Text Summarization and Evaluation

Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these fig...

Bibliographic Details
Main authors: Polepalli Ramesh, Balaji; Sethi, Ricky J.; Yu, Hong
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4313946/
https://www.ncbi.nlm.nih.gov/pubmed/25643357
http://dx.doi.org/10.1371/journal.pone.0115671
author Polepalli Ramesh, Balaji
Sethi, Ricky J.
Yu, Hong
collection PubMed
description Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines, and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+, measuring performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method and achieves an F1 score of 0.66 and a ROUGE-1 score of 0.97. The annotated data are available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).
format Online
Article
Text
id pubmed-4313946
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4313946
2015-02-13
Figure-Associated Text Summarization and Evaluation
Polepalli Ramesh, Balaji
Sethi, Ricky J.
Yu, Hong
PLoS One
Research Article
Public Library of Science
2015-02-02
/pmc/articles/PMC4313946/
/pubmed/25643357
http://dx.doi.org/10.1371/journal.pone.0115671
Text
en
© 2015 Polepalli Ramesh et al
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Figure-Associated Text Summarization and Evaluation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4313946/
https://www.ncbi.nlm.nih.gov/pubmed/25643357
http://dx.doi.org/10.1371/journal.pone.0115671
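
The description above reports precision, recall, F1, and ROUGE-1 scores but does not say how they were computed. The following is a minimal sketch of how such scores are commonly calculated, not the authors' evaluation code. It assumes that a summary is a set of selected sentence identifiers compared against an annotated gold set, and that ROUGE-1 is measured as recall over lowercased, whitespace-split unigrams. The function names and the sample inputs are hypothetical.

```python
from collections import Counter


def precision_recall_f1(selected, gold):
    """Set-overlap precision, recall and F1 between two collections of sentence ids.

    Assumption: both arguments are iterables of hashable sentence identifiers.
    """
    selected, gold = set(selected), set(gold)
    overlap = len(selected & gold)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram matches divided by the reference unigram count.

    Assumption: tokens are lowercased and split on whitespace, with no stemming
    or stop-word removal.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Each reference token is matched at most as often as it occurs in the candidate.
    matched = sum(min(count, cand[token]) for token, count in ref.items())
    return matched / sum(ref.values())


if __name__ == "__main__":
    # Hypothetical sentence ids: the system selected 1, 2, 3; annotators marked 2, 3, 4, 5.
    print(precision_recall_f1([1, 2, 3], [2, 3, 4, 5]))
    # Hypothetical candidate and reference summaries.
    print(rouge1_recall("the cell grows", "the cell divides and grows"))
```

The two measures answer different questions: the set-based F1 penalizes a summary for every selected sentence that annotators did not choose, while unigram recall only asks how much of the reference vocabulary is covered. This is one way a system can score much higher on ROUGE-1 than on F1.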