Data extraction from machine-translated versus original language randomized trial reports: a comparative study

Bibliographic Details
Main authors:	Balk, Ethan M; Chung, Mei; Chen, Minghua L; Chang, Lina Kong Win; Trikalinos, Thomas A
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Methodology
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4226266/ https://www.ncbi.nlm.nih.gov/pubmed/24199894 http://dx.doi.org/10.1186/2046-4053-2-97

author	Balk, Ethan M; Chung, Mei; Chen, Minghua L; Chang, Lina Kong Win; Trikalinos, Thomas A
collection	PubMed
description	BACKGROUND: Google Translate offers free Web-based translation, but it is unknown whether its translation accuracy is sufficient to use in systematic reviews to mitigate concerns about language bias. METHODS: We compared data extraction from non-English language studies with extraction from translations by Google Translate of 10 studies in each of five languages (Chinese, French, German, Japanese and Spanish). Fluent speakers double-extracted original-language articles. Researchers who did not speak the given language double-extracted translated articles along with 10 additional English language trials. Using the original language extractions as a gold standard, we estimated the probability and odds ratio of correctly extracting items from translated articles compared with English, adjusting for reviewer and language. RESULTS: Translation required about 30 minutes per article and extraction of translated articles required additional extraction time. The likelihood of correct extractions was greater for study design and intervention domain items than for outcome descriptions and, particularly, study results. Translated Spanish articles yielded the highest percentage of items (93%) that were correctly extracted more than half the time (followed by German and Japanese 89%, French 85%, and Chinese 78%) but Chinese articles yielded the highest percentage of items (41%) that were correctly extracted >98% of the time (followed by Spanish 30%, French 26%, German 22%, and Japanese 19%). In general, extractors’ confidence in translations was not associated with their accuracy. CONCLUSIONS: Translation by Google Translate generally required few resources. Based on our analysis of translations from five languages, using machine translation has the potential to reduce language bias in systematic reviews; however, pending additional empirical data, reviewers should be cautious about using translated data. There remains a trade-off between completeness of systematic reviews (including all available studies) and risk of error (due to poor translation).
format	Online Article Text
id	pubmed-4226266
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4226266 2014-11-11 Syst Rev Methodology BioMed Central 2013-11-07 /pmc/articles/PMC4226266/ /pubmed/24199894 http://dx.doi.org/10.1186/2046-4053-2-97 Text en Copyright © 2013 Balk et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Data extraction from machine-translated versus original language randomized trial reports: a comparative study
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4226266/ https://www.ncbi.nlm.nih.gov/pubmed/24199894 http://dx.doi.org/10.1186/2046-4053-2-97
