
Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation

In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric m...


Bibliographic Details
Main authors: Smalheiser, Neil R., Bonifield, Gary
Format: Online Article Text
Language: English
Published: University of Illinois at Chicago Library 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4845330/
https://www.ncbi.nlm.nih.gov/pubmed/27213780
http://dx.doi.org/10.5210/disco.v7i0.6654
author Smalheiser, Neil R.
Bonifield, Gary
collection PubMed
description In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric measures the tendency of two MeSH terms to appear in the body of articles written by the same individual (using the 2009 Author-ity author name disambiguation dataset as a gold standard). The two metrics are only modestly correlated with each other (r = 0.50), indicating that they capture different aspects of term usage. The article-based metric provides a measure of semantic relatedness, and MeSH term pairs that co-occur more often than expected by chance may reflect relations between the two terms. In contrast, the author metric is indicative of how individuals practice science, and may have value for author name disambiguation and studies of scientific discovery. We have calculated article metrics for all MeSH terms appearing in at least 25 articles in MEDLINE (as of 2014) and author metrics for MeSH terms published as of 2009. The dataset is freely available for download and can be queried at http://arrowsmith.psych.uic.edu/arrowsmith_uic/mesh_pair_metrics.html. Handling editor: Elizabeth Workman, MLIS, PhD.
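The description above contrasts co-occurrence within the same article with co-occurrence within the body of work of the same author. This record does not give the formulas used in the paper, so the following is only a minimal sketch under stated assumptions: it scores a term pair by the ratio of observed to expected co-occurrence (expected under independence), and it applies the same scoring once to per-article term sets and once to term sets pooled per author. The function names, the toy data, and the observed/expected ratio itself are illustrative assumptions, not the authors' published metrics.

```python
from collections import defaultdict


def cooccurrence_ratio(groups, term_a, term_b):
    """Return observed / expected co-occurrence of two terms across groups.

    `groups` is a list of sets of terms (one set per article, or one set
    per author). The expected count assumes the two terms occur
    independently. Returns None if either term never occurs.
    This ratio is an illustrative assumption, not the paper's formula.
    """
    total = len(groups)
    n_a = sum(1 for g in groups if term_a in g)
    n_b = sum(1 for g in groups if term_b in g)
    n_ab = sum(1 for g in groups if term_a in g and term_b in g)
    if n_a == 0 or n_b == 0:
        return None
    expected = n_a * n_b / total
    return n_ab / expected


def terms_by_author(articles):
    """Pool the terms of all articles written by the same author."""
    pooled = defaultdict(set)
    for author, terms in articles:
        pooled[author] |= set(terms)
    return list(pooled.values())


# Hypothetical toy data: (disambiguated author id, MeSH terms of one article).
ARTICLES = [
    ("author_1", {"Neurons", "Synapses"}),
    ("author_1", {"MicroRNAs"}),
    ("author_2", {"Neurons", "MicroRNAs"}),
    ("author_3", {"Synapses"}),
]

article_groups = [terms for _, terms in ARTICLES]
author_groups = terms_by_author(ARTICLES)

# The pair never shares an article, but one author uses both terms.
article_score = cooccurrence_ratio(article_groups, "Synapses", "MicroRNAs")
author_score = cooccurrence_ratio(author_groups, "Synapses", "MicroRNAs")
print(article_score, author_score)
```

In this toy data the pair never appears in the same article yet is used by the same author across different articles, which illustrates how the two views can diverge and is consistent with the modest correlation reported in the description.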
format Online
Article
Text
id pubmed-4845330
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher University of Illinois at Chicago Library
record_format MEDLINE/PubMed
spelling pubmed-4845330 2016-04-27 Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation Smalheiser, Neil R. Bonifield, Gary J Biomed Discov Collab Research Article University of Illinois at Chicago Library 2016-04-15 /pmc/articles/PMC4845330/ /pubmed/27213780 http://dx.doi.org/10.5210/disco.v7i0.6654 Text en This is an Open Access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation
topic Research Article