
A manual corpus of annotated main findings of clinical case reports

Clinical case reports are the ‘eyewitness reports’ of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present...


Bibliographic Details
Main Authors: Smalheiser, Neil R; Luo, Mengqi; Addepalli, Sidharth; Cui, Xiaokai
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6335863/
https://www.ncbi.nlm.nih.gov/pubmed/30657910
http://dx.doi.org/10.1093/database/bay143
_version_ 1783387976029437952
author Smalheiser, Neil R
Luo, Mengqi
Addepalli, Sidharth
Cui, Xiaokai
author_facet Smalheiser, Neil R
Luo, Mengqi
Addepalli, Sidharth
Cui, Xiaokai
author_sort Smalheiser, Neil R
collection PubMed
description Clinical case reports are the ‘eyewitness reports’ of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present the results of manual annotation carried out by two individuals on 500 randomly sampled case reports. This corpus contains main finding sentences extracted from title, abstract and full-text of the same article that can be regarded as semantically related and are often paraphrases. The final reconciled corpus of 416 articles comprises an open resource for further study. This is the first step in establishing text mining models and tools that can identify main finding sentences in an automated fashion, and in measuring quantitatively how similar any two main findings are. We envision that case reports in PubMed may be automatically indexed by main finding, so that users can carry out information queries for specific main findings (rather than general topics), and given one case report, a user can retrieve those having the most similar main findings. The metric of main finding similarity may also potentially be relevant to the modeling of paraphrasing, summarization and entailment within the biomedical literature.
format Online
Article
Text
id pubmed-6335863
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6335863 2019-01-24 A manual corpus of annotated main findings of clinical case reports Smalheiser, Neil R Luo, Mengqi Addepalli, Sidharth Cui, Xiaokai Database (Oxford) Technical Report Clinical case reports are the ‘eyewitness reports’ of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present the results of manual annotation carried out by two individuals on 500 randomly sampled case reports. This corpus contains main finding sentences extracted from title, abstract and full-text of the same article that can be regarded as semantically related and are often paraphrases. The final reconciled corpus of 416 articles comprises an open resource for further study. This is the first step in establishing text mining models and tools that can identify main finding sentences in an automated fashion, and in measuring quantitatively how similar any two main findings are. We envision that case reports in PubMed may be automatically indexed by main finding, so that users can carry out information queries for specific main findings (rather than general topics), and given one case report, a user can retrieve those having the most similar main findings. The metric of main finding similarity may also potentially be relevant to the modeling of paraphrasing, summarization and entailment within the biomedical literature. Oxford University Press 2019-01-17 /pmc/articles/PMC6335863/ /pubmed/30657910 http://dx.doi.org/10.1093/database/bay143 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Technical Report
Smalheiser, Neil R
Luo, Mengqi
Addepalli, Sidharth
Cui, Xiaokai
A manual corpus of annotated main findings of clinical case reports
title A manual corpus of annotated main findings of clinical case reports
title_full A manual corpus of annotated main findings of clinical case reports
title_fullStr A manual corpus of annotated main findings of clinical case reports
title_full_unstemmed A manual corpus of annotated main findings of clinical case reports
title_short A manual corpus of annotated main findings of clinical case reports
title_sort manual corpus of annotated main findings of clinical case reports
topic Technical Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6335863/
https://www.ncbi.nlm.nih.gov/pubmed/30657910
http://dx.doi.org/10.1093/database/bay143