Deciphering clinical abbreviations with a privacy protecting machine learning system

Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”...

Bibliographic Details
Main Authors: Rajkomar, Alvin; Loreaux, Eric; Liu, Yuchen; Kemp, Jonas; Li, Benny; Chen, Ming-Jun; Zhang, Yi; Mohiuddin, Afroz; Gottweis, Juraj
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9718734/
https://www.ncbi.nlm.nih.gov/pubmed/36460656
http://dx.doi.org/10.1038/s41467-022-35007-9
_version_ 1784843156506804224
author Rajkomar, Alvin
Loreaux, Eric
Liu, Yuchen
Kemp, Jonas
Li, Benny
Chen, Ming-Jun
Zhang, Yi
Mohiuddin, Afroz
Gottweis, Juraj
author_sort Rajkomar, Alvin
collection PubMed
description Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”), or domain-specific vernacular (“cb” for “complicated by”). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes with accuracies ranging from 92.1%-97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data.
format Online
Article
Text
id pubmed-9718734
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9718734 2022-12-04 Deciphering clinical abbreviations with a privacy protecting machine learning system Rajkomar, Alvin Loreaux, Eric Liu, Yuchen Kemp, Jonas Li, Benny Chen, Ming-Jun Zhang, Yi Mohiuddin, Afroz Gottweis, Juraj Nat Commun Article Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”), or domain-specific vernacular (“cb” for “complicated by”). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes with accuracies ranging from 92.1%-97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data. Nature Publishing Group UK 2022-12-02 /pmc/articles/PMC9718734/ /pubmed/36460656 http://dx.doi.org/10.1038/s41467-022-35007-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Deciphering clinical abbreviations with a privacy protecting machine learning system
title_sort deciphering clinical abbreviations with a privacy protecting machine learning system
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9718734/
https://www.ncbi.nlm.nih.gov/pubmed/36460656
http://dx.doi.org/10.1038/s41467-022-35007-9