Linguistic Variation and Change in 250 Years of English Scientific Writing: A Data-Driven Approach
We trace the evolution of Scientific English through the Late Modern period to modern time on the basis of a comprehensive corpus composed of the Transactions and Proceedings of the Royal Society of London, the first and longest-running English scientific journal established in 1665. Specifically, w...
Main authors: | Bizzoni, Yuri; Degaetano-Ortlieb, Stefania; Fankhauser, Peter; Teich, Elke |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2020 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861277/ https://www.ncbi.nlm.nih.gov/pubmed/33733190 http://dx.doi.org/10.3389/frai.2020.00073 |
_version_ | 1783647051399036928 |
---|---|
author | Bizzoni, Yuri; Degaetano-Ortlieb, Stefania; Fankhauser, Peter; Teich, Elke |
author_facet | Bizzoni, Yuri; Degaetano-Ortlieb, Stefania; Fankhauser, Peter; Teich, Elke |
author_sort | Bizzoni, Yuri |
collection | PubMed |
description | We trace the evolution of Scientific English through the Late Modern period to modern time on the basis of a comprehensive corpus composed of the Transactions and Proceedings of the Royal Society of London, the first and longest-running English scientific journal established in 1665. Specifically, we explore the linguistic imprints of specialization and diversification in the science domain which accumulate in the formation of “scientific language” and field-specific sublanguages/registers (chemistry, biology etc.). We pursue an exploratory, data-driven approach using state-of-the-art computational language models and combine them with selected information-theoretic measures (entropy, relative entropy) for comparing models along relevant dimensions of variation (time, register). Focusing on selected linguistic variables (lexis, grammar), we show how we deploy computational language models for capturing linguistic variation and change and discuss benefits and limitations. |
format | Online Article Text |
id | pubmed-7861277 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-78612772021-03-16 Linguistic Variation and Change in 250 Years of English Scientific Writing: A Data-Driven Approach Bizzoni, Yuri Degaetano-Ortlieb, Stefania Fankhauser, Peter Teich, Elke Front Artif Intell Artificial Intelligence We trace the evolution of Scientific English through the Late Modern period to modern time on the basis of a comprehensive corpus composed of the Transactions and Proceedings of the Royal Society of London, the first and longest-running English scientific journal established in 1665. Specifically, we explore the linguistic imprints of specialization and diversification in the science domain which accumulate in the formation of “scientific language” and field-specific sublanguages/registers (chemistry, biology etc.). We pursue an exploratory, data-driven approach using state-of-the-art computational language models and combine them with selected information-theoretic measures (entropy, relative entropy) for comparing models along relevant dimensions of variation (time, register). Focusing on selected linguistic variables (lexis, grammar), we show how we deploy computational language models for capturing linguistic variation and change and discuss benefits and limitations. Frontiers Media S.A. 2020-09-16 /pmc/articles/PMC7861277/ /pubmed/33733190 http://dx.doi.org/10.3389/frai.2020.00073 Text en Copyright © 2020 Bizzoni, Degaetano-Ortlieb, Fankhauser and Teich. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Bizzoni, Yuri Degaetano-Ortlieb, Stefania Fankhauser, Peter Teich, Elke Linguistic Variation and Change in 250 Years of English Scientific Writing: A Data-Driven Approach |
title | Linguistic Variation and Change in 250 Years of English Scientific Writing: A Data-Driven Approach |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861277/ https://www.ncbi.nlm.nih.gov/pubmed/33733190 http://dx.doi.org/10.3389/frai.2020.00073 |