Analysis of ‘One in a Million’ primary care consultation conversations using natural language processing

BACKGROUND: Modern patient electronic health records form a core part of primary care; they contain both clinical codes and free text entered by the clinician. Natural language processing (NLP) could be employed to generate these records through ‘listening’ to a consultation conversation. OBJECTIVES...

Bibliographic Details
Main authors: Pyne, Yvette; Wong, Yik Ming; Fang, Haishuo; Simpson, Edwin
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10151863/
https://www.ncbi.nlm.nih.gov/pubmed/37116948
http://dx.doi.org/10.1136/bmjhci-2022-100659
author Pyne, Yvette
Wong, Yik Ming
Fang, Haishuo
Simpson, Edwin
collection PubMed
description BACKGROUND: Modern patient electronic health records form a core part of primary care; they contain both clinical codes and free text entered by the clinician. Natural language processing (NLP) could be employed to generate these records through ‘listening’ to a consultation conversation. OBJECTIVES: This study develops and assesses several text classifiers for identifying clinical codes for primary care consultations based on the doctor–patient conversation. We evaluate the possibility of training classifiers using medical code descriptions, and the benefits of processing transcribed speech from patients as well as doctors. The study also highlights steps for improving future classifiers. METHODS: Using verbatim transcripts of 239 primary care consultation conversations (the ‘One in a Million’ dataset) and novel additional datasets for distant supervision, we trained NLP classifiers (naïve Bayes, support vector machine, nearest centroid, a conventional BERT classifier and few-shot BERT approaches) to identify the International Classification of Primary Care-2 clinical codes associated with each consultation. RESULTS: Of all models tested, a fine-tuned BERT classifier was the best performer. Distant supervision improved the model’s performance (F1 score over 16 classes) from 0.45 with conventional supervision with 191 labelled transcripts to 0.51. Incorporating patients’ speech in addition to clinician’s speech increased the BERT classifier’s performance from 0.45 to 0.55 F1 (p=0.01, paired bootstrap test). CONCLUSIONS: Our findings demonstrate that NLP classifiers can be trained to identify clinical area(s) being discussed in a primary care consultation from audio transcriptions; this could represent an important step towards a smart digital assistant in the consultation room.
format Online
Article
Text
id pubmed-10151863
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-10151863 2023-05-03 Analysis of ‘One in a Million’ primary care consultation conversations using natural language processing Pyne, Yvette Wong, Yik Ming Fang, Haishuo Simpson, Edwin BMJ Health Care Inform Original Research BMJ Publishing Group 2023-04-28 /pmc/articles/PMC10151863/ /pubmed/37116948 http://dx.doi.org/10.1136/bmjhci-2022-100659 Text en © Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
title Analysis of ‘One in a Million’ primary care consultation conversations using natural language processing
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10151863/
https://www.ncbi.nlm.nih.gov/pubmed/37116948
http://dx.doi.org/10.1136/bmjhci-2022-100659
work_keys_str_mv AT pyneyvette analysisofoneinamillionprimarycareconsultationconversationsusingnaturallanguageprocessing
AT wongyikming analysisofoneinamillionprimarycareconsultationconversationsusingnaturallanguageprocessing
AT fanghaishuo analysisofoneinamillionprimarycareconsultationconversationsusingnaturallanguageprocessing
AT simpsonedwin analysisofoneinamillionprimarycareconsultationconversationsusingnaturallanguageprocessing
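
Illustrative note on the evaluation reported in the description field: the abstract gives an F1 score over 16 classes and a paired bootstrap test (p=0.01), but it does not spell out the procedure. The Python sketch below shows one common way such a comparison can be computed. It rests on assumptions that are not stated in this record: F1 is macro-averaged, each consultation carries a single label, and the p-value is taken as the fraction of resamples in which the second system fails to beat the first. The function names `macro_f1` and `paired_bootstrap_p`, the parameters `n_resamples` and `seed`, and the toy labels are hypothetical and are not taken from the study.

```python
import random
from typing import Sequence


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over classes seen in gold or pred.

    Assumption: macro-averaging; the record does not say which average was used.
    """
    classes = sorted(set(gold) | set(pred))
    scores = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def paired_bootstrap_p(
    gold: Sequence[str],
    pred_a: Sequence[str],
    pred_b: Sequence[str],
    n_resamples: int = 1000,
    seed: int = 0,
) -> float:
    """Fraction of resampled test sets on which system B does not beat system A.

    The same resampled indices are applied to both systems, which is what
    makes the test "paired".
    """
    rng = random.Random(seed)
    n = len(gold)
    not_better = 0
    for _ in range(n_resamples):
        # Draw a test set of the original size, with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        g = [gold[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        if macro_f1(g, b) <= macro_f1(g, a):
            not_better += 1
    return not_better / n_resamples


# Toy example with placeholder labels (not data from the study).
toy_gold = ["A", "D", "R", "A", "D", "R", "A", "D"]
toy_doctor_only = ["A", "D", "A", "A", "R", "R", "D", "D"]
toy_doctor_and_patient = ["A", "D", "R", "A", "D", "R", "D", "D"]
toy_p = paired_bootstrap_p(toy_gold, toy_doctor_only, toy_doctor_and_patient)
```

With real data, `gold`, `pred_a` and `pred_b` would each hold one label per test transcript: the reference codes and the outputs of the two classifiers being compared. A small returned value suggests that the second system's improvement is unlikely to be an artefact of which transcripts happened to be in the test set.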