Predicting Risks of Machine Translations of Public Health Resources by Developing Interpretable Machine Learning Classifiers
We aimed to develop machine learning classifiers as a risk-prevention mechanism to help medical professionals with little or no knowledge of the patient’s languages in order to predict the likelihood of clinically significant mistakes or incomprehensible MT outputs based on the features of English s...
Main Authors: | Xie, Wenxiu; Ji, Meng; Huang, Riliu; Hao, Tianyong; Chow, Chi-Yin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8391184/ https://www.ncbi.nlm.nih.gov/pubmed/34444538 http://dx.doi.org/10.3390/ijerph18168789 |
_version_ | 1783743211386175488 |
---|---|
author | Xie, Wenxiu; Ji, Meng; Huang, Riliu; Hao, Tianyong; Chow, Chi-Yin |
author_sort | Xie, Wenxiu |
collection | PubMed |
description | We aimed to develop machine learning classifiers as a risk-prevention mechanism to help medical professionals with little or no knowledge of their patients’ languages predict the likelihood of clinically significant mistakes or incomprehensible machine translation (MT) outputs, based on features of the English source information given as input to the MT systems. A multinomial Naïve Bayes (MNB) classifier was developed to provide intuitive probabilistic predictions of erroneous health translation outputs based on computational modelling of a small number of optimised features of the original English source texts. The best-performing MNB classifier, using a small number of optimised features (8), achieved statistically significantly higher AUC (M = 0.760, SD = 0.03) than the classifier using high-dimensional natural features (135) (M = 0.631, SD = 0.006, p < 0.0001, SE = 0.004) and the automatically optimised classifier (22) (M = 0.7231, SD = 0.0084, p < 0.0001, SE = 0.004). Furthermore, MNB (8) had statistically significantly higher sensitivity (M = 0.885, SD = 0.100) than the full-feature classifier (135) (M = 0.577, SD = 0.155, p < 0.0001, SE = 0.005) and the automatically optimised classifier (22) (M = 0.731, SD = 0.139, p < 0.0001, SE = 0.0023). Finally, MNB (8) reached statistically significantly higher specificity (M = 0.667, SD = 0.138) than the full-feature classifier (135) (M = 0.567, SD = 0.139, p = 0.0002, SE = 0.026) and the automatically optimised classifier (22) (M = 0.633, SD = 0.141, p = 0.0133, SE = 0.026). |
format | Online Article Text |
id | pubmed-8391184 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8391184 2021-08-28 Predicting Risks of Machine Translations of Public Health Resources by Developing Interpretable Machine Learning Classifiers Xie, Wenxiu; Ji, Meng; Huang, Riliu; Hao, Tianyong; Chow, Chi-Yin Int J Environ Res Public Health Article MDPI 2021-08-20 /pmc/articles/PMC8391184/ /pubmed/34444538 http://dx.doi.org/10.3390/ijerph18168789 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Predicting Risks of Machine Translations of Public Health Resources by Developing Interpretable Machine Learning Classifiers |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8391184/ https://www.ncbi.nlm.nih.gov/pubmed/34444538 http://dx.doi.org/10.3390/ijerph18168789 |
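
The description field of this record reports sensitivity and specificity for a multinomial Naïve Bayes (MNB) classifier. The sketch below is not the authors' implementation; it only illustrates, under stated assumptions, how an MNB classifier with Laplace smoothing can yield probabilistic predictions and how those two metrics are derived from them. The two count features, the toy training and test rows, the 0.5 decision threshold, and all function names are hypothetical and do not come from the article.

```python
import math

def fit_mnb(X, y, alpha=1.0):
    """Fit multinomial Naive Bayes on non-negative count features.

    alpha is the Laplace smoothing constant (an illustrative default).
    """
    log_prior, log_lik = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Smoothed per-feature counts for this class.
        smoothed = [sum(col) + alpha for col in zip(*rows)]
        total = sum(smoothed)
        log_lik[c] = [math.log(s / total) for s in smoothed]
    return log_prior, log_lik

def predict_proba(model, x):
    """Return a dict mapping each class to its posterior probability."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(xi * w for xi, w in zip(x, log_lik[c]))
        for c in log_prior
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    norm = sum(exp.values())
    return {c: v / norm for c, v in exp.items()}

def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: two count features per English source text;
# label 1 = translation judged risky, label 0 = translation judged safe.
X_train = [[5, 1], [4, 0], [6, 2], [1, 4], [0, 5], [2, 6]]
y_train = [1, 1, 1, 0, 0, 0]
X_test = [[4, 1], [1, 5], [3, 2], [0, 2], [2, 3]]
y_test = [1, 0, 1, 0, 1]

model = fit_mnb(X_train, y_train)
# Classify as risky when the predicted probability of class 1 reaches 0.5.
y_pred = [1 if predict_proba(model, x)[1] >= 0.5 else 0 for x in X_test]
sens, spec = sensitivity_specificity(y_test, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In a screening setting like the one the description outlines, lowering the threshold would typically trade specificity for sensitivity, which is one reason the two metrics are reported separately.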