Evaluation of Co-occurring Terms in Clinical Documents Using Latent Semantic Indexing
OBJECTIVES: Measurement of similarities between documents is typically influenced by the sparseness of the term-document matrix employed. Latent semantic indexing (LSI) may improve the results of this type of analysis. METHODS: In this study, LSI was utilized in an attempt to reduce the term vector...
Main authors: | Han, Choonghyun; Yoo, Sooyoung; Choi, Jinwook |
---|---|
Format: | Text |
Language: | English |
Published: | Korean Society of Medical Informatics, 2011 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3092990/ https://www.ncbi.nlm.nih.gov/pubmed/21818454 http://dx.doi.org/10.4258/hir.2011.17.1.24 |
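
The abstract describes using LSI to reduce the term vector space before measuring document similarity. The following is a minimal sketch of that general idea, not the authors' implementation: it builds a tiny, hypothetical term-document matrix, projects the documents onto the top `k` singular directions with NumPy's SVD, and compares cosine similarities before and after. The toy terms, the counts, the choice of `k = 2`, and the helper names `cosine_matrix` and `lsi_document_vectors` are all assumptions made for illustration.

```python
import numpy as np


def cosine_matrix(X):
    """Pairwise cosine similarity between the columns of X."""
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    Xn = X / norms
    return Xn.T @ Xn


def lsi_document_vectors(A, k):
    """Project documents (columns of A) into a k-dimensional latent space.

    Uses the truncated SVD A ~ U_k S_k V_k^T and returns S_k V_k^T,
    whose columns are the reduced document vectors.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return s[:k, None] * Vt[:k, :]


# Hypothetical toy corpus: rows are terms, columns are documents.
terms = ["mi", "infarction", "troponin", "fracture", "cast"]
A = np.array(
    [
        [1, 0, 0, 0],  # mi
        [0, 1, 0, 0],  # infarction
        [1, 1, 0, 0],  # troponin (co-occurs with both cardiac terms)
        [0, 0, 1, 1],  # fracture
        [0, 0, 1, 0],  # cast
    ],
    dtype=float,
)

raw = cosine_matrix(A)                              # similarity in term space
lsi = cosine_matrix(lsi_document_vectors(A, k=2))   # similarity in latent space

print("raw similarity doc0 vs doc1:", round(raw[0, 1], 3))
print("LSI similarity doc0 vs doc1:", round(lsi[0, 1], 3))
print("LSI similarity doc0 vs doc2:", round(lsi[0, 2], 3))
```

In this toy case, documents 0 and 1 use different expressions ("mi" and "infarction") but share the co-occurring term "troponin", so their similarity rises after the rank-2 projection, while unrelated documents stay apart. Real clinical text would need tokenization, term weighting, and a value of `k` chosen for the corpus at hand.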
_version_ | 1782203423315197952 |
---|---|
author | Han, Choonghyun Yoo, Sooyoung Choi, Jinwook |
collection | PubMed |
description | OBJECTIVES: Measurement of similarities between documents is typically influenced by the sparseness of the term-document matrix employed. Latent semantic indexing (LSI) may improve the results of this type of analysis. METHODS: In this study, LSI was used to reduce the term vector space of clinical documents and newspaper editorials. RESULTS: After applying LSI, document similarities were revealed more clearly in clinical documents than in editorials. Clinical documents, which are characterized by co-occurring medical terms, various expressions for the same concepts, abbreviations, and typographical errors, showed greater improvement with regard to the correlation between co-occurring terms and document similarities. CONCLUSIONS: Our results showed that LSI can be used effectively to measure similarities in clinical documents. In addition, the correlation between term co-occurrence and document similarity observed in this study is an important positive feature associated with LSI. |
format | Text |
id | pubmed-3092990 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Korean Society of Medical Informatics |
record_format | MEDLINE/PubMed |
spelling | pubmed-3092990 2011-07-13 Evaluation of Co-occurring Terms in Clinical Documents Using Latent Semantic Indexing Han, Choonghyun Yoo, Sooyoung Choi, Jinwook Healthc Inform Res Original Article Korean Society of Medical Informatics 2011-03 2011-03-31 /pmc/articles/PMC3092990/ /pubmed/21818454 http://dx.doi.org/10.4258/hir.2011.17.1.24 Text en © 2011 The Korean Society of Medical Informatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Evaluation of Co-occurring Terms in Clinical Documents Using Latent Semantic Indexing |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3092990/ https://www.ncbi.nlm.nih.gov/pubmed/21818454 http://dx.doi.org/10.4258/hir.2011.17.1.24 |