Reading Akkadian cuneiform using natural language processing

In this paper we present a new method for automatic transliteration and segmentation of Unicode cuneiform glyphs using Natural Language Processing (NLP) techniques. Cuneiform is one of the earliest known writing systems in the world, and it documents millennia of human civilizations in the ancient Nea...

Bibliographic Details
Main authors:	Gordin, Shai, Gutherz, Gai, Elazary, Ariel, Romach, Avital, Jiménez, Enrique, Berant, Jonathan, Cohen, Yoram
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2020
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592802/ https://www.ncbi.nlm.nih.gov/pubmed/33112872 http://dx.doi.org/10.1371/journal.pone.0240511

author	Gordin, Shai Gutherz, Gai Elazary, Ariel Romach, Avital Jiménez, Enrique Berant, Jonathan Cohen, Yoram
collection	PubMed
description	In this paper we present a new method for automatic transliteration and segmentation of Unicode cuneiform glyphs using Natural Language Processing (NLP) techniques. Cuneiform is one of the earliest known writing systems in the world, and it documents millennia of human civilizations in the ancient Near East. Hundreds of thousands of cuneiform texts were found in the nineteenth and twentieth centuries CE, most of which are written in Akkadian. However, tens of thousands of texts remain to be published. We use models based on machine learning algorithms such as recurrent neural networks (RNN), with an accuracy reaching up to 97%, for automatically transliterating and segmenting standard Unicode cuneiform glyphs into words. Our method and results therefore form a major step towards a human-machine interface for creating digitized editions. Our code, Akkademia, is made publicly available for use via a web application, a Python package, and a GitHub repository.
format	Online Article Text
id	pubmed-7592802
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-7592802 2020-11-02 PLoS One Research Article Public Library of Science 2020-10-28 /pmc/articles/PMC7592802/ /pubmed/33112872 http://dx.doi.org/10.1371/journal.pone.0240511 Text en © 2020 Gordin et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Reading Akkadian cuneiform using natural language processing
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592802/ https://www.ncbi.nlm.nih.gov/pubmed/33112872 http://dx.doi.org/10.1371/journal.pone.0240511
