
Recognizing speculative language in biomedical research articles: a linguistically motivated perspective

BACKGROUND: Due to the nature of scientific methodology, research articles are rich in speculative and tentative statements, also known as hedges. We explore a linguistically motivated approach to the problem of recognizing such language in biomedical research articles. Our approach draws on prior l...


Bibliographic Details
Main Authors:	Kilicoglu, Halil; Bergler, Sabine
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586760/ https://www.ncbi.nlm.nih.gov/pubmed/19025686 http://dx.doi.org/10.1186/1471-2105-9-S11-S10

author	Kilicoglu, Halil Bergler, Sabine
collection	PubMed
description	BACKGROUND: Due to the nature of scientific methodology, research articles are rich in speculative and tentative statements, also known as hedges. We explore a linguistically motivated approach to the problem of recognizing such language in biomedical research articles. Our approach draws on prior linguistic work as well as existing lexical resources to create a dictionary of hedging cues and extends it by introducing syntactic patterns. Furthermore, recognizing that hedging cues differ in speculative strength, we assign them weights in two ways: automatically using the information gain (IG) measure and semi-automatically based on their types and centrality to hedging. Weights of hedging cues are used to determine the speculative strength of sentences. RESULTS: We test our system on two publicly available hedging datasets. On the fruit-fly dataset, we achieve a precision-recall breakeven point (BEP) of 0.85 using the semi-automatic weighting scheme and a lower BEP of 0.80 with the information gain weighting scheme. These results are competitive with the previously reported best results (BEP of 0.85). On the BMC dataset, using semi-automatic weighting yields a BEP of 0.82, a statistically significant improvement (p < 0.01) over the previously reported best result (BEP of 0.76), while information gain weighting yields a BEP of 0.70. CONCLUSION: Our results demonstrate that speculative language can be recognized successfully with a linguistically motivated approach and confirm that selection of hedging devices affects the speculative strength of the sentence, which can be captured reasonably by weighting the hedging cues. The improvement obtained on the BMC dataset with a semi-automatic weighting scheme indicates that our linguistically oriented approach is more portable than the machine-learning based approaches. Lower performance obtained with the information gain weighting scheme suggests that this method may benefit from a larger, manually annotated corpus for automatically inducing the weights.
format	Text
id	pubmed-2586760
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2586760 2008-11-26 Recognizing speculative language in biomedical research articles: a linguistically motivated perspective Kilicoglu, Halil Bergler, Sabine BMC Bioinformatics Research BioMed Central 2008-11-19 /pmc/articles/PMC2586760/ /pubmed/19025686 http://dx.doi.org/10.1186/1471-2105-9-S11-S10 Text en Copyright © 2008 Kilicoglu and Bergler; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Recognizing speculative language in biomedical research articles: a linguistically motivated perspective
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586760/ https://www.ncbi.nlm.nih.gov/pubmed/19025686 http://dx.doi.org/10.1186/1471-2105-9-S11-S10
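
The abstract in this record describes weighting hedging cues and using those weights to determine the speculative strength of a sentence, with one weighting scheme based on information gain (IG). The following is a minimal Python sketch of that general idea, not the authors' implementation. The cue list, the weight values, the threshold, the summing rule, and all function names are assumptions made for illustration; the record does not state them. The sketch also matches single tokens only and ignores the syntactic patterns mentioned in the abstract.

```python
import math
import re

# Hypothetical cue weights for illustration only; the record does not
# list the actual dictionary of hedging cues or their weights.
CUE_WEIGHTS = {"suggest": 4.0, "may": 3.0, "possible": 3.0, "whether": 2.0}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def speculative_strength(sentence, cue_weights=CUE_WEIGHTS):
    """Sum the weights of every cue token found (assumed scoring rule)."""
    return sum(cue_weights.get(token, 0.0) for token in tokenize(sentence))


def is_speculative(sentence, threshold=3.0, cue_weights=CUE_WEIGHTS):
    """Label a sentence speculative when its score reaches the threshold.

    The threshold value is an assumption, not taken from the record.
    """
    return speculative_strength(sentence, cue_weights) >= threshold


def entropy(positives, negatives):
    """Binary entropy (in bits) of a set with the given class counts."""
    total = positives + negatives
    result = 0.0
    for count in (positives, negatives):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result


def information_gain(labeled_sentences, cue):
    """Standard IG of the feature 'sentence contains cue'.

    labeled_sentences is a list of (sentence, is_speculative) pairs.
    """
    n = len(labeled_sentences)
    if n == 0:
        return 0.0
    with_cue = [lab for s, lab in labeled_sentences if cue in tokenize(s)]
    without_cue = [lab for s, lab in labeled_sentences if cue not in tokenize(s)]

    def label_entropy(labels):
        pos = sum(labels)
        return entropy(pos, len(labels) - pos)

    all_labels = [lab for _, lab in labeled_sentences]
    conditional = (len(with_cue) / n) * label_entropy(with_cue) \
        + (len(without_cue) / n) * label_entropy(without_cue)
    return label_entropy(all_labels) - conditional
```

Under these assumptions, a sentence such as "These results suggest that the gene may be involved." accumulates the weights of "suggest" and "may", while a sentence with no listed cue scores zero. Varying the threshold trades precision against recall, which is how a breakeven point such as the BEP figures in the abstract would be located.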
