PubMed-supported clinical term weighting approach for improving inter-patient similarity measure in diagnosis prediction

Bibliographic Details
Main Authors: Chan, Lawrence WC, Liu, Ying, Chan, Tao, Law, Helen KW, Wong, SC Cesar, Yeung, Andy PH, Lo, KF, Yeung, SW, Kwok, KY, Chan, William YL, Lau, Thomas YH, Shyu, Chi-Ren
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4450834/
https://www.ncbi.nlm.nih.gov/pubmed/26032596
http://dx.doi.org/10.1186/s12911-015-0166-2
_version_ 1782374058161078272
author Chan, Lawrence WC
Liu, Ying
Chan, Tao
Law, Helen KW
Wong, SC Cesar
Yeung, Andy PH
Lo, KF
Yeung, SW
Kwok, KY
Chan, William YL
Lau, Thomas YH
Shyu, Chi-Ren
author_sort Chan, Lawrence WC
collection PubMed
description BACKGROUND: Similarity-based retrieval of Electronic Health Records (EHRs) from large clinical information systems gives physicians evidence to support making diagnoses or referring examinations for suspected cases. Clinical terms in EHRs represent high-level conceptual information, and a similarity measure based on these terms reflects the chance of inter-patient disease co-occurrence. The assumption that all clinical terms are equally relevant to a disease is unrealistic and reduces prediction accuracy. Here we propose a term weighting approach supported by the PubMed search engine to address this issue.
METHODS: We collected and studied 112 abdominal computed tomography imaging examination reports from four hospitals in Hong Kong. Clinical terms, namely the image findings related to hepatocellular carcinoma (HCC), were extracted from the reports. Through two systematic PubMed search methods, generic and specific term weightings were established by estimating the conditional probabilities of clinical terms given HCC. Each report was characterized by an ontological feature vector, giving 6216 vector pairs in total. We optimized the modified direction cosine (mDC) with respect to a regularization constant embedded in the feature vector. Equal, generic and specific term weighting approaches were applied to measure the similarity of each pair, and their performance in predicting inter-patient co-occurrence of HCC diagnoses was compared using Receiver Operating Characteristic (ROC) analysis.
RESULTS: The areas under the ROC curves (AUROCs) of similarity scores based on equal, generic and specific term weighting approaches were 0.735, 0.728 and 0.743 respectively (p < 0.01). Compared with equal term weighting, performance was significantly improved by specific term weighting (p < 0.01) but not by generic term weighting. The clinical terms “Dysplastic nodule”, “nodule of liver” and “equal density (isodense) lesion” were found to be the top three image findings associated with HCC in PubMed.
CONCLUSIONS: Our findings suggest that the optimized similarity measure with specific term weighting applied to EHRs can significantly improve the accuracy of predicting inter-patient co-occurrence of diagnosis when compared with equal and generic term weighting approaches.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12911-015-0166-2) contains supplementary material, which is available to authorized users.
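The pipeline summarized in the abstract (weight terms, score every report pair, evaluate with ROC analysis) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the modified direction cosine or its regularization constant, so a plain weighted cosine stands in for it. The hit-count ratio used for P(term | HCC), the rule that a pair is positive when both reports carry the diagnosis, and all function names are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def term_weights(joint_hits, disease_hits):
    """Assumed estimate of P(term | disease): hits(term AND disease) / hits(disease)."""
    return {term: n / disease_hits for term, n in joint_hits.items()}


def weighted_cosine(u, v, w):
    """Weighted cosine of two term vectors; a stand-in for the paper's mDC."""
    dot = sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
    norm_u = sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_v = sqrt(sum(wi * vi * vi for wi, vi in zip(w, v)))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pairwise_scores(vectors, has_disease, w):
    """Score every unordered report pair; label 1 when both reports have the diagnosis (assumed)."""
    scores, labels = [], []
    for i, j in combinations(range(len(vectors)), 2):
        scores.append(weighted_cosine(vectors[i], vectors[j], w))
        labels.append(int(has_disease[i] and has_disease[j]))
    return scores, labels


def auroc(scores, labels):
    """AUROC as the chance that a positive pair outscores a negative pair (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With 112 reports, `combinations` yields 112 × 111 / 2 = 6216 unordered pairs, which matches the pair count stated in the abstract. Passing a weight vector of all ones corresponds to equal term weighting, so the same functions can compare weighting schemes by their AUROC.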
format Online
Article
Text
id pubmed-4450834
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4450834 2015-06-02 BMC Med Inform Decis Mak Research Article BioMed Central 2015-06-02 /pmc/articles/PMC4450834/ /pubmed/26032596 http://dx.doi.org/10.1186/s12911-015-0166-2 Text en © Chan et al.; licensee BioMed Central. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title PubMed-supported clinical term weighting approach for improving inter-patient similarity measure in diagnosis prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4450834/
https://www.ncbi.nlm.nih.gov/pubmed/26032596
http://dx.doi.org/10.1186/s12911-015-0166-2