Finding related sentence pairs in MEDLINE

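We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detecting relatedness, and found that machine learning was superior. The Huber method, a variant of Support Vector Machines which minimizes the modified Huber loss function, achieves 73% precision when the score cutoff is set high enough to identify about one related sentence per abstract on average. We illustrate how an abstract viewed in PubMed might be modified to present the related sentences found in other abstracts by this automatic procedure.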
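The record names the modified Huber loss and a precision figure at a score cutoff without spelling either out. The following is a minimal sketch, assuming the standard textbook definition of the modified Huber loss for labels of +1 and -1; the function names and the toy data layout are hypothetical and are not taken from the paper.

```python
def modified_huber_loss(label, score):
    """Modified Huber loss for one example (assumed standard definition).

    label is +1 or -1; score is the classifier's real-valued output.
    """
    margin = label * score
    if margin >= -1.0:
        # Quadratically smoothed hinge: zero once the margin reaches 1.
        return max(0.0, 1.0 - margin) ** 2
    # Linear penalty for badly misclassified examples.
    return -4.0 * margin


def precision_at_cutoff(scored_pairs, cutoff):
    """Fraction of truly related pairs among those scoring at or above cutoff.

    scored_pairs is an iterable of (score, is_related) tuples.
    Returns None when no pair reaches the cutoff.
    """
    selected = [is_related for score, is_related in scored_pairs if score >= cutoff]
    if not selected:
        return None
    return sum(selected) / len(selected)
```

Raising the cutoff keeps fewer candidate sentence pairs, which is the trade-off the abstract describes when it reports precision at roughly one related sentence per abstract.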

Bibliographic Details
Main Authors: Smith, Larry H., Wilbur, W. John
Format: Text
Language: English
Published: Springer Netherlands 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2992462/
https://www.ncbi.nlm.nih.gov/pubmed/21170415
http://dx.doi.org/10.1007/s10791-010-9126-8
author Smith, Larry H.
Wilbur, W. John
collection PubMed
description We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detecting relatedness, and found that machine learning was superior. The Huber method, a variant of Support Vector Machines which minimizes the modified Huber loss function, achieves 73% precision when the score cutoff is set high enough to identify about one related sentence per abstract on average. We illustrate how an abstract viewed in PubMed might be modified to present the related sentences found in other abstracts by this automatic procedure.
format Text
id pubmed-2992462
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-2992462 2010-12-15 Finding related sentence pairs in MEDLINE Smith, Larry H. Wilbur, W. John Inf Retr Boston Article Springer Netherlands 2010-01-23 2010 /pmc/articles/PMC2992462/ /pubmed/21170415 http://dx.doi.org/10.1007/s10791-010-9126-8 Text en © The Author(s) 2010 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
title Finding related sentence pairs in MEDLINE
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2992462/
https://www.ncbi.nlm.nih.gov/pubmed/21170415
http://dx.doi.org/10.1007/s10791-010-9126-8