Sentence retrieval for abstracts of randomized controlled trials
BACKGROUND: The practice of evidence-based medicine (EBM) requires clinicians to integrate their expertise with the latest scientific research. But this is becoming increasingly difficult with the growing numbers of published articles. There is a clear need for better tools to improve clinician'...
Main author: | Chung, Grace Y |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2009 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2657779/ https://www.ncbi.nlm.nih.gov/pubmed/19208256 http://dx.doi.org/10.1186/1472-6947-9-10 |
_version_ | 1782165613339213824 |
---|---|
author | Chung, Grace Y |
collection | PubMed |
description | BACKGROUND: The practice of evidence-based medicine (EBM) requires clinicians to integrate their expertise with the latest scientific research. But this is becoming increasingly difficult with the growing number of published articles. There is a clear need for better tools to improve clinicians' ability to search the primary literature. Randomized clinical trials (RCTs) are the most reliable source of evidence documenting the efficacy of treatment options. This paper describes the retrieval of key sentences from abstracts of RCTs as a step towards helping users find relevant facts about the experimental design of clinical studies. METHOD: Using Conditional Random Fields (CRFs), a popular and successful method for natural language processing problems, sentences referring to Intervention, Participants and Outcome Measures are automatically categorized. This is done by extending a previous approach for labeling sentences in an abstract for general categories associated with scientific argumentation or rhetorical roles: Aim, Method, Results and Conclusion. Methods are tested on several corpora of RCT abstracts. First, structured abstracts with headings specifically indicating Intervention, Participant and Outcome Measures are used. Also, a manually annotated corpus of structured and unstructured abstracts is prepared for testing a classifier that identifies sentences belonging to each category. RESULTS: Using CRFs, sentences can be labeled for the four rhetorical roles with F-scores from 0.93 to 0.98. This outperforms the use of Support Vector Machines. Furthermore, sentences can be automatically labeled for Intervention, Participant and Outcome Measures, in unstructured and structured abstracts where the section headings do not specifically indicate these three topics. F-scores of up to 0.83 and 0.84 are obtained for Intervention and Outcome Measure sentences, respectively. CONCLUSION: Results indicate that some of the methodological elements of RCTs are identifiable at the sentence level in both structured and unstructured abstract reports. This is promising in that sentences labeled automatically could potentially form concise summaries, assist in information retrieval and finer-grained extraction. |
format | Text |
id | pubmed-2657779 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2657779 2009-03-19 Sentence retrieval for abstracts of randomized controlled trials Chung, Grace Y BMC Med Inform Decis Mak Research Article BACKGROUND: The practice of evidence-based medicine (EBM) requires clinicians to integrate their expertise with the latest scientific research. But this is becoming increasingly difficult with the growing number of published articles. There is a clear need for better tools to improve clinicians' ability to search the primary literature. Randomized clinical trials (RCTs) are the most reliable source of evidence documenting the efficacy of treatment options. This paper describes the retrieval of key sentences from abstracts of RCTs as a step towards helping users find relevant facts about the experimental design of clinical studies. METHOD: Using Conditional Random Fields (CRFs), a popular and successful method for natural language processing problems, sentences referring to Intervention, Participants and Outcome Measures are automatically categorized. This is done by extending a previous approach for labeling sentences in an abstract for general categories associated with scientific argumentation or rhetorical roles: Aim, Method, Results and Conclusion. Methods are tested on several corpora of RCT abstracts. First, structured abstracts with headings specifically indicating Intervention, Participant and Outcome Measures are used. Also, a manually annotated corpus of structured and unstructured abstracts is prepared for testing a classifier that identifies sentences belonging to each category. RESULTS: Using CRFs, sentences can be labeled for the four rhetorical roles with F-scores from 0.93 to 0.98. This outperforms the use of Support Vector Machines. Furthermore, sentences can be automatically labeled for Intervention, Participant and Outcome Measures, in unstructured and structured abstracts where the section headings do not specifically indicate these three topics. F-scores of up to 0.83 and 0.84 are obtained for Intervention and Outcome Measure sentences, respectively. CONCLUSION: Results indicate that some of the methodological elements of RCTs are identifiable at the sentence level in both structured and unstructured abstract reports. This is promising in that sentences labeled automatically could potentially form concise summaries, assist in information retrieval and finer-grained extraction. BioMed Central 2009-02-10 /pmc/articles/PMC2657779/ /pubmed/19208256 http://dx.doi.org/10.1186/1472-6947-9-10 Text en Copyright ©2009 Chung; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Chung, Grace Y Sentence retrieval for abstracts of randomized controlled trials |
title | Sentence retrieval for abstracts of randomized controlled trials |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2657779/ https://www.ncbi.nlm.nih.gov/pubmed/19208256 http://dx.doi.org/10.1186/1472-6947-9-10 |