A manual corpus of annotated main findings of clinical case reports
Clinical case reports are the 'eyewitness reports' of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present...
Main authors: | Smalheiser, Neil R; Luo, Mengqi; Addepalli, Sidharth; Cui, Xiaokai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2019 |
Subjects: | Technical Report |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6335863/ https://www.ncbi.nlm.nih.gov/pubmed/30657910 http://dx.doi.org/10.1093/database/bay143 |
_version_ | 1783387976029437952 |
---|---|
author | Smalheiser, Neil R Luo, Mengqi Addepalli, Sidharth Cui, Xiaokai |
author_facet | Smalheiser, Neil R Luo, Mengqi Addepalli, Sidharth Cui, Xiaokai |
author_sort | Smalheiser, Neil R |
collection | PubMed |
description | Clinical case reports are the 'eyewitness reports' of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally, a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present the results of manual annotation carried out by two individuals on 500 randomly sampled case reports. This corpus contains main finding sentences extracted from the title, abstract and full text of the same article; these can be regarded as semantically related and are often paraphrases. The final reconciled corpus of 416 articles constitutes an open resource for further study. This is the first step in establishing text mining models and tools that can identify main finding sentences in an automated fashion, and in measuring quantitatively how similar any two main findings are. We envision that case reports in PubMed may be automatically indexed by main finding, so that users can carry out information queries for specific main findings (rather than general topics) and, given one case report, retrieve those having the most similar main findings. The metric of main finding similarity may also potentially be relevant to the modeling of paraphrasing, summarization and entailment within the biomedical literature. |
format | Online Article Text |
id | pubmed-6335863 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6335863 2019-01-24 A manual corpus of annotated main findings of clinical case reports Smalheiser, Neil R Luo, Mengqi Addepalli, Sidharth Cui, Xiaokai Database (Oxford) Technical Report Clinical case reports are the 'eyewitness reports' of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present the results of manual annotation carried out by two individuals on 500 randomly sampled case reports. This corpus contains main finding sentences extracted from title, abstract and full-text of the same article that can be regarded as semantically related and are often paraphrases. The final reconciled corpus of 416 articles comprises an open resource for further study. This is the first step in establishing text mining models and tools that can identify main finding sentences in an automated fashion, and in measuring quantitatively how similar any two main findings are. We envision that case reports in PubMed may be automatically indexed by main finding, so that users can carry out information queries for specific main findings (rather than general topics), and given one case report, a user can retrieve those having the most similar main findings. The metric of main finding similarity may also potentially be relevant to the modeling of paraphrasing, summarization and entailment within the biomedical literature. Oxford University Press 2019-01-17 /pmc/articles/PMC6335863/ /pubmed/30657910 http://dx.doi.org/10.1093/database/bay143 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Report Smalheiser, Neil R Luo, Mengqi Addepalli, Sidharth Cui, Xiaokai A manual corpus of annotated main findings of clinical case reports |
title | A manual corpus of annotated main findings of clinical case reports |
title_sort | manual corpus of annotated main findings of clinical case reports |
topic | Technical Report |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6335863/ https://www.ncbi.nlm.nih.gov/pubmed/30657910 http://dx.doi.org/10.1093/database/bay143 |