Evaluating the lexico-grammatical differences in the writing of native and non-native speakers of English in peer-reviewed medical journals in the field of pediatric oncology: Creation of the genuine index scoring system
INTRODUCTION: The predominance of English in scientific research has created hurdles for “non-native speakers” of English. Here we present a novel application of native language identification (NLI) for the assessment of medical-scientific writing. For this purpose, we created a novel classification...
Main Authors: | Gayle, Alberto Alexander; Shimaoka, Motomu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5315297/ https://www.ncbi.nlm.nih.gov/pubmed/28212419 http://dx.doi.org/10.1371/journal.pone.0172338 |
_version_ | 1782508666773045248 |
---|---|
author | Gayle, Alberto Alexander Shimaoka, Motomu |
author_facet | Gayle, Alberto Alexander Shimaoka, Motomu |
author_sort | Gayle, Alberto Alexander |
collection | PubMed |
description | INTRODUCTION: The predominance of English in scientific research has created hurdles for “non-native speakers” of English. Here we present a novel application of native language identification (NLI) for the assessment of medical-scientific writing. For this purpose, we created a novel classification system whereby scoring would be based solely on text features found to be distinctive among native English speakers (NS) within a given context. We dubbed this the “Genuine Index” (GI). METHODOLOGY: This methodology was validated using a small set of journals in the field of pediatric oncology. Our dataset consisted of 5,907 abstracts, representing work from 77 countries. A support vector machine (SVM) was used to generate our model and for scoring. RESULTS: Accuracy, precision, and recall of the classification model were 93.3%, 93.7%, and 99.4%, respectively. Class specific F-scores were 96.5% for NS and 39.8% for our benchmark class, Japan. Overall kappa was calculated to be 37.2%. We found significant differences between countries with respect to the GI score. Significant correlation was found between GI scores and two validated objective measures of writing proficiency and readability. Two sets of key terms and phrases differentiating NS and non-native writing were identified. CONCLUSIONS: Our GI model was able to detect, with a high degree of reliability, subtle differences between the terms and phrasing used by native and non-native speakers in peer reviewed journals, in the field of pediatric oncology. In addition, L1 language transfer was found to be very likely to survive revision, especially in non-Western countries such as Japan. These findings show that even when the language used is technically correct, there may still be some phrasing or usage that impact quality. |
format | Online Article Text |
id | pubmed-5315297 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5315297 2017-03-03 Evaluating the lexico-grammatical differences in the writing of native and non-native speakers of English in peer-reviewed medical journals in the field of pediatric oncology: Creation of the genuine index scoring system Gayle, Alberto Alexander Shimaoka, Motomu PLoS One Research Article INTRODUCTION: The predominance of English in scientific research has created hurdles for “non-native speakers” of English. Here we present a novel application of native language identification (NLI) for the assessment of medical-scientific writing. For this purpose, we created a novel classification system whereby scoring would be based solely on text features found to be distinctive among native English speakers (NS) within a given context. We dubbed this the “Genuine Index” (GI). METHODOLOGY: This methodology was validated using a small set of journals in the field of pediatric oncology. Our dataset consisted of 5,907 abstracts, representing work from 77 countries. A support vector machine (SVM) was used to generate our model and for scoring. RESULTS: Accuracy, precision, and recall of the classification model were 93.3%, 93.7%, and 99.4%, respectively. Class specific F-scores were 96.5% for NS and 39.8% for our benchmark class, Japan. Overall kappa was calculated to be 37.2%. We found significant differences between countries with respect to the GI score. Significant correlation was found between GI scores and two validated objective measures of writing proficiency and readability. Two sets of key terms and phrases differentiating NS and non-native writing were identified. CONCLUSIONS: Our GI model was able to detect, with a high degree of reliability, subtle differences between the terms and phrasing used by native and non-native speakers in peer reviewed journals, in the field of pediatric oncology. In addition, L1 language transfer was found to be very likely to survive revision, especially in non-Western countries such as Japan. These findings show that even when the language used is technically correct, there may still be some phrasing or usage that impact quality. Public Library of Science 2017-02-17 /pmc/articles/PMC5315297/ /pubmed/28212419 http://dx.doi.org/10.1371/journal.pone.0172338 Text en © 2017 Gayle, Shimaoka http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Evaluating the lexico-grammatical differences in the writing of native and non-native speakers of English in peer-reviewed medical journals in the field of pediatric oncology: Creation of the genuine index scoring system |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5315297/ https://www.ncbi.nlm.nih.gov/pubmed/28212419 http://dx.doi.org/10.1371/journal.pone.0172338 |
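
The description field reports precision (93.7%), recall (99.4%), and a class-specific F-score of 96.5% for NS. As a rough consistency check, the sketch below recomputes an F-score from the two reported figures. It rests on two assumptions the record does not state: that the reported precision and recall refer to the NS class, and that "F-score" means the balanced F1 (harmonic mean). The function name `f_score` and the variable names are illustrative only and do not come from the article.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the record's description field, written as fractions.
# Assumption: both values describe the NS (native speaker) class.
reported_precision = 0.937
reported_recall = 0.994

ns_f_score = f_score(reported_precision, reported_recall)
print(f"F-score from reported precision and recall: {ns_f_score:.1%}")
```

Under those assumptions the recomputed value rounds to the 96.5% quoted for NS, which suggests the headline precision and recall describe the majority NS class rather than the Japan benchmark class.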