
Predicting the Linguistic Accessibility of Chinese Health Translations: Machine Learning Algorithm Development


Bibliographic Details
Main Authors:	Ji, Meng; Bouillon, Pierrette
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2021
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532010/ https://www.ncbi.nlm.nih.gov/pubmed/34617914 http://dx.doi.org/10.2196/30588

collection	PubMed
description	BACKGROUND: Linguistic accessibility has an important impact on the reception and utilization of translated health resources among multicultural and multilingual populations. Linguistic understandability of health translation has been understudied.
OBJECTIVE: Our study aimed to develop novel machine learning models for the study of the linguistic accessibility of health translations, comparing Chinese translations of the World Health Organization health materials with original Chinese health resources developed by the Chinese health authorities.
METHODS: Using natural language processing tools for the assessment of the readability of Chinese materials, we explored and compared the readability of Chinese health translations from the World Health Organization with original Chinese materials from the China Center for Disease Control and Prevention.
RESULTS: A pairwise adjusted t test showed that the following 3 new machine learning models achieved statistically significant improvement over the baseline logistic regression in terms of area under the curve: C5.0 decision tree (95% CI –0.249 to –0.152; P<0.001), random forest (95% CI 0.139 to 0.239; P<0.001), and extreme gradient boosting tree (95% CI 0.099 to 0.193; P<0.001). There was, however, no significant difference between the C5.0 decision tree and random forest (P=0.513). The extreme gradient boosting tree was the best model, achieving statistically significant improvement over the C5.0 model (P=0.003) and the random forest model (P=0.006) at a Bonferroni-adjusted P value threshold of 0.008.
CONCLUSIONS: The development of machine learning algorithms significantly improved the accuracy and reliability of current approaches to the evaluation of the linguistic accessibility of Chinese health information, especially Chinese health translations in relation to original health resources. Although the new algorithms developed were based on Chinese health resources, they can be adapted for other languages to advance current research in accessible health translation, communication, and promotion.
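The Bonferroni threshold of 0.008 quoted in the RESULTS can be read as follows. This is a minimal sketch, not the authors' code: it assumes a family-wise alpha of 0.05 divided across all pairwise comparisons among the 4 models, which the abstract does not state explicitly. The function name `bonferroni_threshold` is hypothetical, and each "P<0.001" result is represented by its upper bound of 0.001.

```python
from itertools import combinations

def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons

# The four models named in the abstract.
LR = "logistic regression"
C50 = "C5.0 decision tree"
RF = "random forest"
XGB = "extreme gradient boosting tree"

# Every unordered pair of models: 4 choose 2 = 6 comparisons.
pairs = list(combinations([LR, C50, RF, XGB], 2))

# Assumption: family-wise alpha of 0.05 (not stated in the abstract).
threshold = bonferroni_threshold(0.05, len(pairs))

# P values as reported in the abstract; 0.001 stands in for "P<0.001".
reported_p = {
    (LR, C50): 0.001,
    (LR, RF): 0.001,
    (LR, XGB): 0.001,
    (C50, RF): 0.513,
    (C50, XGB): 0.003,
    (RF, XGB): 0.006,
}

# A comparison counts as significant when its P value is below the threshold.
significant = {pair: p < threshold for pair, p in reported_p.items()}

for pair, is_sig in significant.items():
    print(f"{pair[0]} vs {pair[1]}: P={reported_p[pair]} significant={is_sig}")
```

Under this assumption the threshold is 0.05/6, which rounds to 0.008. That matches the reported pattern: only the C5.0 versus random forest comparison falls short of significance.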
format	Online Article Text
id	pubmed-8532010
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-8532010 2021-11-09 Predicting the Linguistic Accessibility of Chinese Health Translations: Machine Learning Algorithm Development Ji, Meng Bouillon, Pierrette JMIR Med Inform Original Paper JMIR Publications 2021-10-07 /pmc/articles/PMC8532010/ /pubmed/34617914 http://dx.doi.org/10.2196/30588 Text en
©Meng Ji, Pierrette Bouillon. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 07.10.2021. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Predicting the Linguistic Accessibility of Chinese Health Translations: Machine Learning Algorithm Development
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532010/ https://www.ncbi.nlm.nih.gov/pubmed/34617914 http://dx.doi.org/10.2196/30588
