
Predicting Risks of Machine Translations of Public Health Resources by Developing Interpretable Machine Learning Classifiers

We aimed to develop machine learning classifiers as a risk-prevention mechanism that helps medical professionals with little or no knowledge of the patient’s languages predict the likelihood of clinically significant mistakes or incomprehensible MT outputs, based on the features of English s...


Bibliographic Details
Main authors: Xie, Wenxiu; Ji, Meng; Huang, Riliu; Hao, Tianyong; Chow, Chi-Yin
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8391184/
https://www.ncbi.nlm.nih.gov/pubmed/34444538
http://dx.doi.org/10.3390/ijerph18168789
author Xie, Wenxiu
Ji, Meng
Huang, Riliu
Hao, Tianyong
Chow, Chi-Yin
collection PubMed
description We aimed to develop machine learning classifiers as a risk-prevention mechanism that helps medical professionals with little or no knowledge of the patient’s languages predict the likelihood of clinically significant mistakes or incomprehensible machine translation (MT) outputs, based on features of the English source information given as input to the MT systems. A multinomial Naïve Bayes (MNB) classifier was developed to provide intuitive probabilistic predictions of erroneous health translation outputs, based on computational modelling of a small number of optimised features of the original English source texts. The best-performing MNB classifier, using a small number of optimised features (8), achieved statistically significantly higher AUC (M = 0.760, SD = 0.03) than the classifier using high-dimension natural features (135) (M = 0.631, SD = 0.006, p < 0.0001, SE = 0.004) and the automatically optimised classifier (22) (M = 0.7231, SD = 0.0084, p < 0.0001, SE = 0.004). Furthermore, MNB (8) had statistically significantly higher sensitivity (M = 0.885, SD = 0.100) than the full-feature classifier (135) (M = 0.577, SD = 0.155, p < 0.0001, SE = 0.005) and the automatically optimised classifier (22) (M = 0.731, SD = 0.139, p < 0.0001, SE = 0.0023). Finally, MNB (8) reached statistically significantly higher specificity (M = 0.667, SD = 0.138) than the full-feature classifier (135) (M = 0.567, SD = 0.139, p = 0.0002, SE = 0.026) and the automatically optimised classifier (22) (M = 0.633, SD = 0.141, p = 0.0133, SE = 0.026).
format Online
Article
Text
id pubmed-8391184
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8391184 2021-08-28 Int J Environ Res Public Health Article MDPI 2021-08-20 /pmc/articles/PMC8391184/ /pubmed/34444538 http://dx.doi.org/10.3390/ijerph18168789 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Predicting Risks of Machine Translations of Public Health Resources by Developing Interpretable Machine Learning Classifiers
topic Article
work_keys_str_mv AT xiewenxiu predictingrisksofmachinetranslationsofpublichealthresourcesbydevelopinginterpretablemachinelearningclassifiers
AT jimeng predictingrisksofmachinetranslationsofpublichealthresourcesbydevelopinginterpretablemachinelearningclassifiers
AT huangriliu predictingrisksofmachinetranslationsofpublichealthresourcesbydevelopinginterpretablemachinelearningclassifiers
AT haotianyong predictingrisksofmachinetranslationsofpublichealthresourcesbydevelopinginterpretablemachinelearningclassifiers
AT chowchiyin predictingrisksofmachinetranslationsofpublichealthresourcesbydevelopinginterpretablemachinelearningclassifiers
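The abstract in this record describes a multinomial Naïve Bayes classifier that gives probabilistic predictions from a small set of source-text features and is evaluated by sensitivity and specificity. The sketch below illustrates that general technique only; it is not the authors' implementation. The three count features, the toy data, and the function names (train_mnb, predict_proba, sensitivity_specificity) are hypothetical, since the record does not list the eight optimised features used in the study.

```python
import math

def train_mnb(X, y, alpha=1.0):
    """Fit a multinomial Naive Bayes model on non-negative count features.

    alpha is the Laplace smoothing constant (an assumed default of 1.0).
    """
    n_features = len(X[0])
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Total count of each feature within class c.
        totals = [sum(row[j] for row in rows) for j in range(n_features)]
        denom = sum(totals) + alpha * n_features
        log_likelihood[c] = [math.log((t + alpha) / denom) for t in totals]
    return log_prior, log_likelihood

def predict_proba(model, x):
    """Return {class: posterior probability} for one feature vector."""
    log_prior, log_likelihood = model
    scores = {
        c: log_prior[c] + sum(v * w for v, w in zip(x, log_likelihood[c]))
        for c in log_prior
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: each row holds three made-up count features of an
# English source text; label 1 marks a translation judged erroneous.
X_toy = [[5, 1, 0], [4, 2, 1], [6, 0, 1], [0, 3, 4], [1, 4, 5], [0, 2, 6]]
y_toy = [1, 1, 1, 0, 0, 0]

model = train_mnb(X_toy, y_toy)
proba = predict_proba(model, [5, 1, 0])
```

Here proba[1] is the estimated probability that a text with those feature counts yields an erroneous translation. Thresholding it (for example at 0.5) produces the class labels that sensitivity_specificity compares against the reference labels.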