Finding related sentence pairs in MEDLINE
We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detecting relatedness, and found that machine learning was superior. The Huber method, a variant...
Main authors: | Smith, Larry H., Wilbur, W. John |
---|---|
Format: | Text |
Language: | English |
Published: | Springer Netherlands 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2992462/ https://www.ncbi.nlm.nih.gov/pubmed/21170415 http://dx.doi.org/10.1007/s10791-010-9126-8 |
_version_ | 1782192740629479424 |
---|---|
author | Smith, Larry H. Wilbur, W. John |
author_facet | Smith, Larry H. Wilbur, W. John |
author_sort | Smith, Larry H. |
collection | PubMed |
description | We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detecting relatedness, and found that machine learning was superior. The Huber method, a variant of Support Vector Machines which minimizes the modified Huber loss function, achieves 73% precision when the score cutoff is set high enough to identify about one related sentence per abstract on average. We illustrate how an abstract viewed in PubMed might be modified to present the related sentences found in other abstracts by this automatic procedure. |
format | Text |
id | pubmed-2992462 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Springer Netherlands |
record_format | MEDLINE/PubMed |
spelling | pubmed-29924622010-12-15 Finding related sentence pairs in MEDLINE Smith, Larry H. Wilbur, W. John Inf Retr Boston Article We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detecting relatedness, and found that machine learning was superior. The Huber method, a variant of Support Vector Machines which minimizes the modified Huber loss function, achieves 73% precision when the score cutoff is set high enough to identify about one related sentence per abstract on average. We illustrate how an abstract viewed in PubMed might be modified to present the related sentences found in other abstracts by this automatic procedure. Springer Netherlands 2010-01-23 2010 /pmc/articles/PMC2992462/ /pubmed/21170415 http://dx.doi.org/10.1007/s10791-010-9126-8 Text en © The Author(s) 2010 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited. |
spellingShingle | Article Smith, Larry H. Wilbur, W. John Finding related sentence pairs in MEDLINE |
title | Finding related sentence pairs in MEDLINE |
title_sort | finding related sentence pairs in medline |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2992462/ https://www.ncbi.nlm.nih.gov/pubmed/21170415 http://dx.doi.org/10.1007/s10791-010-9126-8 |
work_keys_str_mv | AT smithlarryh findingrelatedsentencepairsinmedline AT wilburwjohn findingrelatedsentencepairsinmedline |