
EASIER corpus: A lexical simplification resource for people with cognitive impairments

Thanks to technologies such as the Internet and devices now available to people, we have increasingly greater access to larger quantities of information. However, people with ageing disabilities or intellectual disabilities, non-native speakers, and others have difficulties reading and understanding...

Bibliographic Details
Main Authors: Alarcon, Rodrigo; Moreno, Lourdes; Martínez, Paloma
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10096182/
https://www.ncbi.nlm.nih.gov/pubmed/37043424
http://dx.doi.org/10.1371/journal.pone.0283622
_version_ 1785024269635289088
author Alarcon, Rodrigo
Moreno, Lourdes
Martínez, Paloma
author_facet Alarcon, Rodrigo
Moreno, Lourdes
Martínez, Paloma
author_sort Alarcon, Rodrigo
collection PubMed
description Thanks to technologies such as the Internet and devices now available to people, we have increasingly greater access to larger quantities of information. However, people with ageing disabilities or intellectual disabilities, non-native speakers, and others have difficulties reading and understanding information. For this reason, it is essential to provide text simplification mechanisms when accessing information. Natural Language Processing methods can be applied to simplify textual content and improve understanding. These methods often use machine learning algorithms and models which require resources, such as corpora, to be trained and tested. This article presents the EASIER corpus, a resource that can be used to build lexical simplification methods to process Spanish domain-independent texts. The EASIER corpus is composed of 260 annotated documents with 8,155 words labelled as complex and 5,130 words with at least one proposed context-aware synonym associated. Expert linguists in easy-to-read and plain language guidelines have annotated the corpus based on their experience adapting texts for people with intellectual disabilities. Sixteen annotation guidelines that discriminate between complex and simple words have been defined to help other groups of experts to generate new annotations. Additionally, an inter-annotator agreement test was performed to validate the corpus, obtaining a Fleiss Kappa coefficient of 0.641. Furthermore, a qualitative evaluation was conducted with 45 users (including people with intellectual disabilities, elderly people, and a control audience). Complex word identification tasks achieved moderate results, but the synonyms proposed to replace complex words achieved almost perfect ratings. This resource has been integrated into the EASIER platform, a tool that helps people with cognitive impairments and intellectual disabilities to read and understand texts more easily.
format Online
Article
Text
id pubmed-10096182
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10096182 2023-04-13 EASIER corpus: A lexical simplification resource for people with cognitive impairments Alarcon, Rodrigo Moreno, Lourdes Martínez, Paloma PLoS One Research Article Public Library of Science 2023-04-12 /pmc/articles/PMC10096182/ /pubmed/37043424 http://dx.doi.org/10.1371/journal.pone.0283622 Text en © 2023 Alarcon et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Alarcon, Rodrigo
Moreno, Lourdes
Martínez, Paloma
EASIER corpus: A lexical simplification resource for people with cognitive impairments
title EASIER corpus: A lexical simplification resource for people with cognitive impairments
title_full EASIER corpus: A lexical simplification resource for people with cognitive impairments
title_fullStr EASIER corpus: A lexical simplification resource for people with cognitive impairments
title_full_unstemmed EASIER corpus: A lexical simplification resource for people with cognitive impairments
title_short EASIER corpus: A lexical simplification resource for people with cognitive impairments
title_sort easier corpus: a lexical simplification resource for people with cognitive impairments
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10096182/
https://www.ncbi.nlm.nih.gov/pubmed/37043424
http://dx.doi.org/10.1371/journal.pone.0283622
work_keys_str_mv AT alarconrodrigo easiercorpusalexicalsimplificationresourceforpeoplewithcognitiveimpairments
AT morenolourdes easiercorpusalexicalsimplificationresourceforpeoplewithcognitiveimpairments
AT martinezpaloma easiercorpusalexicalsimplificationresourceforpeoplewithcognitiveimpairments