Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers
Clinical text de-identification enables collaborative research while protecting patient privacy and confidentiality; however, concerns persist about the reduction in the utility of the de-identified text for information extraction and machine learning tasks. In the context of a deep learning experim...
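Main authors: | Obeid, Jihad S.; Heider, Paul M.; Weeda, Erin R.; Matuskowitz, Andrew J.; Carr, Christine M.; Gagnon, Kevin; Crawford, Tami; Meystre, Stephane M. |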
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6779034/ https://www.ncbi.nlm.nih.gov/pubmed/31437930 http://dx.doi.org/10.3233/SHTI190228 |
_version_ | 1783456876105564160 |
---|---|
author | Obeid, Jihad S. Heider, Paul M. Weeda, Erin R. Matuskowitz, Andrew J. Carr, Christine M. Gagnon, Kevin Crawford, Tami Meystre, Stephane M. |
author_facet | Obeid, Jihad S. Heider, Paul M. Weeda, Erin R. Matuskowitz, Andrew J. Carr, Christine M. Gagnon, Kevin Crawford, Tami Meystre, Stephane M. |
author_sort | Obeid, Jihad S. |
collection | PubMed |
description | Clinical text de-identification enables collaborative research while protecting patient privacy and confidentiality; however, concerns persist about the reduction in the utility of the de-identified text for information extraction and machine learning tasks. In the context of a deep learning experiment to detect altered mental status in emergency department provider notes, we tested several classifiers on clinical notes in their original form and on their automatically de-identified counterpart. We tested both traditional bag-of-words based machine learning models as well as word-embedding based deep learning models. We evaluated the models on 1,113 history of present illness notes. A total of 1,795 protected health information tokens were replaced in the de-identification process across all notes. The deep learning models had the best performance with accuracies of 95% on both original and de-identified notes. However, there was no significant difference in the performance of any of the models on the original vs. the de-identified notes. |
format | Online Article Text |
id | pubmed-6779034 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
record_format | MEDLINE/PubMed |
spelling | pubmed-67790342019-10-07 Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers Obeid, Jihad S. Heider, Paul M. Weeda, Erin R. Matuskowitz, Andrew J. Carr, Christine M. Gagnon, Kevin Crawford, Tami Meystre, Stephane M. Stud Health Technol Inform Article Clinical text de-identification enables collaborative research while protecting patient privacy and confidentiality; however, concerns persist about the reduction in the utility of the de-identified text for information extraction and machine learning tasks. In the context of a deep learning experiment to detect altered mental status in emergency department provider notes, we tested several classifiers on clinical notes in their original form and on their automatically de-identified counterpart. We tested both traditional bag-of-words based machine learning models as well as word-embedding based deep learning models. We evaluated the models on 1,113 history of present illness notes. A total of 1,795 protected health information tokens were replaced in the de-identification process across all notes. The deep learning models had the best performance with accuracies of 95% on both original and de-identified notes. However, there was no significant difference in the performance of any of the models on the original vs. the de-identified notes. 2019-08-21 /pmc/articles/PMC6779034/ /pubmed/31437930 http://dx.doi.org/10.3233/SHTI190228 Text en http://creativecommons.org/licenses/by-nc/4.0/ This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). |
spellingShingle | Article Obeid, Jihad S. Heider, Paul M. Weeda, Erin R. Matuskowitz, Andrew J. Carr, Christine M. Gagnon, Kevin Crawford, Tami Meystre, Stephane M. Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers |
title | Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers |
title_full | Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers |
title_fullStr | Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers |
title_full_unstemmed | Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers |
title_short | Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers |
title_sort | impact of de-identification on clinical text classification using traditional and deep learning classifiers |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6779034/ https://www.ncbi.nlm.nih.gov/pubmed/31437930 http://dx.doi.org/10.3233/SHTI190228 |
work_keys_str_mv | AT obeidjihads impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT heiderpaulm impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT weedaerinr impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT matuskowitzandrewj impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT carrchristinem impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT gagnonkevin impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT crawfordtami impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers AT meystrestephanem impactofdeidentificationonclinicaltextclassificationusingtraditionalanddeeplearningclassifiers |