Deciphering clinical abbreviations with a privacy protecting machine learning system
Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”...
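Main authors: | Rajkomar, Alvin; Loreaux, Eric; Liu, Yuchen; Kemp, Jonas; Li, Benny; Chen, Ming-Jun; Zhang, Yi; Mohiuddin, Afroz; Gottweis, Juraj |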
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Nature Publishing Group UK
2022
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9718734/ https://www.ncbi.nlm.nih.gov/pubmed/36460656 http://dx.doi.org/10.1038/s41467-022-35007-9 |
_version_ | 1784843156506804224 |
---|---|
author | Rajkomar, Alvin Loreaux, Eric Liu, Yuchen Kemp, Jonas Li, Benny Chen, Ming-Jun Zhang, Yi Mohiuddin, Afroz Gottweis, Juraj |
author_facet | Rajkomar, Alvin Loreaux, Eric Liu, Yuchen Kemp, Jonas Li, Benny Chen, Ming-Jun Zhang, Yi Mohiuddin, Afroz Gottweis, Juraj |
author_sort | Rajkomar, Alvin |
collection | PubMed |
description | Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”), or domain-specific vernacular (“cb” for “complicated by”). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes with accuracies ranging from 92.1%-97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data. |
format | Online Article Text |
id | pubmed-9718734 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9718734 2022-12-04 Deciphering clinical abbreviations with a privacy protecting machine learning system Rajkomar, Alvin Loreaux, Eric Liu, Yuchen Kemp, Jonas Li, Benny Chen, Ming-Jun Zhang, Yi Mohiuddin, Afroz Gottweis, Juraj Nat Commun Article Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”), or domain-specific vernacular (“cb” for “complicated by”). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes with accuracies ranging from 92.1%-97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data. Nature Publishing Group UK 2022-12-02 /pmc/articles/PMC9718734/ /pubmed/36460656 http://dx.doi.org/10.1038/s41467-022-35007-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material.
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Rajkomar, Alvin Loreaux, Eric Liu, Yuchen Kemp, Jonas Li, Benny Chen, Ming-Jun Zhang, Yi Mohiuddin, Afroz Gottweis, Juraj Deciphering clinical abbreviations with a privacy protecting machine learning system |
title | Deciphering clinical abbreviations with a privacy protecting machine learning system |
title_full | Deciphering clinical abbreviations with a privacy protecting machine learning system |
title_fullStr | Deciphering clinical abbreviations with a privacy protecting machine learning system |
title_full_unstemmed | Deciphering clinical abbreviations with a privacy protecting machine learning system |
title_short | Deciphering clinical abbreviations with a privacy protecting machine learning system |
title_sort | deciphering clinical abbreviations with a privacy protecting machine learning system |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9718734/ https://www.ncbi.nlm.nih.gov/pubmed/36460656 http://dx.doi.org/10.1038/s41467-022-35007-9 |
work_keys_str_mv | AT rajkomaralvin decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT loreauxeric decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT liuyuchen decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT kempjonas decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT libenny decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT chenmingjun decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT zhangyi decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT mohiuddinafroz decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem AT gottweisjuraj decipheringclinicalabbreviationswithaprivacyprotectingmachinelearningsystem |