
Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification

OBJECTIVE: Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which the small, privacy-restricted training datasets might lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset. We sought to evaluate shallow and deep learning text classifiers and the impact of pretrained embeddings in a small clinical dataset. MATERIALS AND METHODS: We participated in the 2018 National NLP Clinical Challenges (n2c2) Shared Task on cohort selection and received an annotated dataset with medical narratives of 202 patients for multilabel binary text classification. We set our baseline to a majority classifier, to which we compared a rule-based classifier and orthogonal machine learning strategies: support vector machines, logistic regression, and long short-term memory neural networks. We evaluated logistic regression and long short-term memory using both self-trained and pretrained BioWordVec word embeddings as input representation schemes. RESULTS: The rule-based classifier showed the highest overall micro F1 score (0.9100), with which we finished first in the challenge. Shallow machine learning strategies showed lower overall micro F1 scores, but still higher than deep learning strategies and the baseline. We could not show a difference in classification efficiency between self-trained and pretrained embeddings. DISCUSSION: Clinical context, negation, and value-based criteria hindered shallow machine learning approaches, while deep learning strategies could not capture the term diversity due to the small training dataset. CONCLUSION: Shallow methods for clinical phenotyping can still outperform deep learning methods on small imbalanced data, even when supported by pretrained embeddings.
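To make the methods summary above more concrete, the sketch below shows, in scikit-learn, what a shallow multilabel baseline of the kind compared in the study (one binary logistic-regression classifier per selection criterion over bag-of-words features, scored by overall micro F1) might look like. This is an illustration only: the code, the toy clinical notes, and the criterion labels are invented and are not taken from the paper or from the n2c2 dataset.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score
import numpy as np

# Toy notes; the real task used 202 de-identified patient narratives.
train_texts = [
    "Creatinine 2.4 mg/dL; history of myocardial infarction in the past year.",
    "No abdominal surgery; HbA1c within normal limits, no MI documented.",
    "Status post bowel resection; creatinine elevated at 1.9 mg/dL.",
    "Routine follow-up, labs unremarkable, no cardiac events.",
]
# One column per (illustrative) selection criterion; 1 = criterion met, 0 = not met.
y_train = np.array([
    [1, 1],
    [0, 0],
    [1, 0],
    [0, 0],
])

test_texts = ["Creatinine 2.1 mg/dL, prior MI noted."]
y_test = np.array([[1, 1]])

# One binary logistic-regression classifier per criterion over word/bigram TF-IDF features.
model = OneVsRestClassifier(
    make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
        LogisticRegression(max_iter=1000),
    )
)
model.fit(train_texts, y_train)
pred = model.predict(test_texts)

# Overall micro F1 pools true positives, false positives, and false negatives across all labels.
print("overall micro F1:", f1_score(y_test, pred, average="micro"))

A deep-learning variant, as described in the abstract, would replace the TF-IDF features with self-trained or pretrained BioWordVec word embeddings feeding a long short-term memory network.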


Bibliographic Details
Main Authors: Oleynik, Michel, Kugic, Amila, Kasáč, Zdenko, Kreuzthaler, Markus
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6798565/
https://www.ncbi.nlm.nih.gov/pubmed/31512729
http://dx.doi.org/10.1093/jamia/ocz149
_version_ 1783460074660823040
author Oleynik, Michel
Kugic, Amila
Kasáč, Zdenko
Kreuzthaler, Markus
author_facet Oleynik, Michel
Kugic, Amila
Kasáč, Zdenko
Kreuzthaler, Markus
author_sort Oleynik, Michel
collection PubMed
description OBJECTIVE: Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which the small, privacy-restricted training datasets might lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset. We sought to evaluate shallow and deep learning text classifiers and the impact of pretrained embeddings in a small clinical dataset. MATERIALS AND METHODS: We participated in the 2018 National NLP Clinical Challenges (n2c2) Shared Task on cohort selection and received an annotated dataset with medical narratives of 202 patients for multilabel binary text classification. We set our baseline to a majority classifier, to which we compared a rule-based classifier and orthogonal machine learning strategies: support vector machines, logistic regression, and long short-term memory neural networks. We evaluated logistic regression and long short-term memory using both self-trained and pretrained BioWordVec word embeddings as input representation schemes. RESULTS: The rule-based classifier showed the highest overall micro F1 score (0.9100), with which we finished first in the challenge. Shallow machine learning strategies showed lower overall micro F1 scores, but still higher than deep learning strategies and the baseline. We could not show a difference in classification efficiency between self-trained and pretrained embeddings. DISCUSSION: Clinical context, negation, and value-based criteria hindered shallow machine learning approaches, while deep learning strategies could not capture the term diversity due to the small training dataset. CONCLUSION: Shallow methods for clinical phenotyping can still outperform deep learning methods on small imbalanced data, even when supported by pretrained embeddings.
format Online
Article
Text
id pubmed-6798565
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6798565 2019-10-24
Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
Oleynik, Michel
Kugic, Amila
Kasáč, Zdenko
Kreuzthaler, Markus
J Am Med Inform Assoc
Research and Applications
OBJECTIVE: Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which the small, privacy-restricted training datasets might lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset. We sought to evaluate shallow and deep learning text classifiers and the impact of pretrained embeddings in a small clinical dataset. MATERIALS AND METHODS: We participated in the 2018 National NLP Clinical Challenges (n2c2) Shared Task on cohort selection and received an annotated dataset with medical narratives of 202 patients for multilabel binary text classification. We set our baseline to a majority classifier, to which we compared a rule-based classifier and orthogonal machine learning strategies: support vector machines, logistic regression, and long short-term memory neural networks. We evaluated logistic regression and long short-term memory using both self-trained and pretrained BioWordVec word embeddings as input representation schemes. RESULTS: The rule-based classifier showed the highest overall micro F1 score (0.9100), with which we finished first in the challenge. Shallow machine learning strategies showed lower overall micro F1 scores, but still higher than deep learning strategies and the baseline. We could not show a difference in classification efficiency between self-trained and pretrained embeddings. DISCUSSION: Clinical context, negation, and value-based criteria hindered shallow machine learning approaches, while deep learning strategies could not capture the term diversity due to the small training dataset. CONCLUSION: Shallow methods for clinical phenotyping can still outperform deep learning methods on small imbalanced data, even when supported by pretrained embeddings.
Oxford University Press 2019-09-12
/pmc/articles/PMC6798565/
/pubmed/31512729
http://dx.doi.org/10.1093/jamia/ocz149
Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Research and Applications
Oleynik, Michel
Kugic, Amila
Kasáč, Zdenko
Kreuzthaler, Markus
Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
title Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
title_full Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
title_fullStr Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
title_full_unstemmed Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
title_short Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
title_sort evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6798565/
https://www.ncbi.nlm.nih.gov/pubmed/31512729
http://dx.doi.org/10.1093/jamia/ocz149
work_keys_str_mv AT oleynikmichel evaluatingshallowanddeeplearningstrategiesforthe2018n2c2sharedtaskonclinicaltextclassification
AT kugicamila evaluatingshallowanddeeplearningstrategiesforthe2018n2c2sharedtaskonclinicaltextclassification
AT kasaczdenko evaluatingshallowanddeeplearningstrategiesforthe2018n2c2sharedtaskonclinicaltextclassification
AT kreuzthalermarkus evaluatingshallowanddeeplearningstrategiesforthe2018n2c2sharedtaskonclinicaltextclassification