Query bot for retrieving patients’ clinical history: A COVID-19 use-case
OBJECTIVE: With increasing patient complexity whose data are stored in fragmented health information systems, automated and time-efficient ways of gathering important information from the patients' medical history are needed for effective clinical decision making. Using COVID-19 as a case study...
Main authors: | Wang, Yibo; Tariq, Amara; Khan, Fiza; Gichoya, Judy Wawira; Trivedi, Hari; Banerjee, Imon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Inc. 2021 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8454191/ https://www.ncbi.nlm.nih.gov/pubmed/34560275 http://dx.doi.org/10.1016/j.jbi.2021.103918 |
_version_ | 1784570438431539200 |
---|---|
author | Wang, Yibo Tariq, Amara Khan, Fiza Gichoya, Judy Wawira Trivedi, Hari Banerjee, Imon |
author_facet | Wang, Yibo Tariq, Amara Khan, Fiza Gichoya, Judy Wawira Trivedi, Hari Banerjee, Imon |
author_sort | Wang, Yibo |
collection | PubMed |
description | OBJECTIVE: With increasing patient complexity whose data are stored in fragmented health information systems, automated and time-efficient ways of gathering important information from the patients' medical history are needed for effective clinical decision making. Using COVID-19 as a case study, we developed a query-bot information retrieval system with user-feedback to allow clinicians to ask natural questions to retrieve data from patient notes. MATERIALS AND METHODS: We applied clinicalBERT, a pre-trained contextual language model, to our dataset of patient notes to obtain sentence embeddings, using K-Means to reduce computation time for real-time interaction. The Rocchio algorithm was then employed to incorporate user-feedback and improve retrieval performance. RESULTS: In an iterative feedback loop experiment, mean average precision (MAP) at the final iteration was 0.93/0.94, compared to an initial MAP of 0.66/0.52, for generic queries and 1.0/1.0, compared to 0.79/0.83, for COVID-19 specific queries, confirming that the contextual model handles the ambiguity in natural language queries and that feedback helps to improve retrieval performance. The user-in-loop experiment also outperformed the automated pseudo relevance feedback method. Moreover, the null hypothesis, which assumes identical precision between initial retrieval and relevance feedback, was rejected with high statistical significance (p ≪ 0.05). Compared to Word2Vec, TF-IDF and bioBERT models, clinicalBERT works optimally considering the balance between response precision and user-feedback. DISCUSSION: Our model works well for generic as well as COVID-19 specific queries. However, some generic queries are not answered as well as others because clustering reduces query performance and vague relations between queries and sentences are considered non-relevant. We also tested our model for queries with the same meaning but different expressions and demonstrated that these query variations yielded similar performance after incorporation of user-feedback. CONCLUSION: In conclusion, we developed an NLP-based query-bot that handles synonyms and natural language ambiguity in order to retrieve relevant information from the patient chart. User-feedback is critical to improve model performance. |
format | Online Article Text |
id | pubmed-8454191 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8454191 2021-09-21 Query bot for retrieving patients’ clinical history: A COVID-19 use-case Wang, Yibo Tariq, Amara Khan, Fiza Gichoya, Judy Wawira Trivedi, Hari Banerjee, Imon J Biomed Inform Original Research OBJECTIVE: With increasing patient complexity whose data are stored in fragmented health information systems, automated and time-efficient ways of gathering important information from the patients' medical history are needed for effective clinical decision making. Using COVID-19 as a case study, we developed a query-bot information retrieval system with user-feedback to allow clinicians to ask natural questions to retrieve data from patient notes. MATERIALS AND METHODS: We applied clinicalBERT, a pre-trained contextual language model, to our dataset of patient notes to obtain sentence embeddings, using K-Means to reduce computation time for real-time interaction. The Rocchio algorithm was then employed to incorporate user-feedback and improve retrieval performance. RESULTS: In an iterative feedback loop experiment, mean average precision (MAP) at the final iteration was 0.93/0.94, compared to an initial MAP of 0.66/0.52, for generic queries and 1.0/1.0, compared to 0.79/0.83, for COVID-19 specific queries, confirming that the contextual model handles the ambiguity in natural language queries and that feedback helps to improve retrieval performance. The user-in-loop experiment also outperformed the automated pseudo relevance feedback method. Moreover, the null hypothesis, which assumes identical precision between initial retrieval and relevance feedback, was rejected with high statistical significance (p ≪ 0.05). Compared to Word2Vec, TF-IDF and bioBERT models, clinicalBERT works optimally considering the balance between response precision and user-feedback. DISCUSSION: Our model works well for generic as well as COVID-19 specific queries. However, some generic queries are not answered as well as others because clustering reduces query performance and vague relations between queries and sentences are considered non-relevant. We also tested our model for queries with the same meaning but different expressions and demonstrated that these query variations yielded similar performance after incorporation of user-feedback. CONCLUSION: In conclusion, we developed an NLP-based query-bot that handles synonyms and natural language ambiguity in order to retrieve relevant information from the patient chart. User-feedback is critical to improve model performance. Elsevier Inc. 2021-11 2021-09-21 /pmc/articles/PMC8454191/ /pubmed/34560275 http://dx.doi.org/10.1016/j.jbi.2021.103918 Text en © 2021 Elsevier Inc. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Original Research Wang, Yibo Tariq, Amara Khan, Fiza Gichoya, Judy Wawira Trivedi, Hari Banerjee, Imon Query bot for retrieving patients’ clinical history: A COVID-19 use-case |
title | Query bot for retrieving patients’ clinical history: A COVID-19 use-case |
title_full | Query bot for retrieving patients’ clinical history: A COVID-19 use-case |
title_fullStr | Query bot for retrieving patients’ clinical history: A COVID-19 use-case |
title_full_unstemmed | Query bot for retrieving patients’ clinical history: A COVID-19 use-case |
title_short | Query bot for retrieving patients’ clinical history: A COVID-19 use-case |
title_sort | query bot for retrieving patients’ clinical history: a covid-19 use-case |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8454191/ https://www.ncbi.nlm.nih.gov/pubmed/34560275 http://dx.doi.org/10.1016/j.jbi.2021.103918 |
work_keys_str_mv | AT wangyibo querybotforretrievingpatientsclinicalhistoryacovid19usecase AT tariqamara querybotforretrievingpatientsclinicalhistoryacovid19usecase AT khanfiza querybotforretrievingpatientsclinicalhistoryacovid19usecase AT gichoyajudywawira querybotforretrievingpatientsclinicalhistoryacovid19usecase AT trivedihari querybotforretrievingpatientsclinicalhistoryacovid19usecase AT banerjeeimon querybotforretrievingpatientsclinicalhistoryacovid19usecase |
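
The description above mentions that the Rocchio algorithm is used to fold user feedback into retrieval. The following is a minimal sketch of a textbook Rocchio update over sentence embeddings, not the authors' implementation. The function names, the toy two-dimensional vectors, and the weights `alpha`, `beta` and `gamma` are illustrative assumptions; the record does not state which values or which similarity measure the paper used, and cosine similarity is assumed here.

```python
import math


def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant embeddings and away from non-relevant ones.

    alpha, beta, gamma are commonly quoted textbook defaults (an assumption,
    not values taken from the paper).
    """
    dim = len(query)
    rel_centroid = mean_vector(relevant, dim)
    non_centroid = mean_vector(non_relevant, dim)
    return [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, rel_centroid, non_centroid)
    ]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query, sentences):
    """Return sentence indices sorted by descending similarity to the query."""
    return sorted(
        range(len(sentences)),
        key=lambda i: cosine(query, sentences[i]),
        reverse=True,
    )


# Toy stand-ins for sentence embeddings (hypothetical values).
sentences = [[1.0, 0.1], [0.0, 1.0], [0.6, 0.8]]
query = [1.0, 0.0]

initial_ranking = rank(query, sentences)
# Suppose the user marks sentence 1 as relevant and sentence 0 as non-relevant.
updated_query = rocchio_update(query, [sentences[1]], [sentences[0]])
updated_ranking = rank(updated_query, sentences)
```

In an iterative loop of the kind the description refers to, the updated query would replace the original one before the next round of ranking and feedback.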