Hope Speech Detection for Dravidian Languages Using Cross-Lingual Embeddings with Stacked Encoder Architecture

Bibliographic Details
Main Authors: Sundar, Arunima; Ramakrishnan, Akshay; Balaji, Avantika; Durairaj, Thenmozhi
Format: Online Article Text
Language: English
Published: Springer Singapore 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600493/
https://www.ncbi.nlm.nih.gov/pubmed/34812423
http://dx.doi.org/10.1007/s42979-021-00943-8
author Sundar, Arunima
Ramakrishnan, Akshay
Balaji, Avantika
Durairaj, Thenmozhi
collection PubMed
description The task of hope speech detection has gained traction in natural language processing owing to the need for more positive reinforcement online during the COVID-19 pandemic. Hope speech detection focuses on identifying social media comments that could evoke positive emotions in people. Students and working adults alike report experiencing considerable work-induced stress, which suggests a need for external inspiration that, in the current scenario, is mostly found online. In this paper, we propose a multilingual model, with the main emphasis on Dravidian languages, to automatically detect hope speech. We employ a stacked encoder architecture that uses language-agnostic cross-lingual word embeddings, as the dataset consists of code-mixed YouTube comments. Additionally, we carried out an empirical analysis and tested our architecture against various traditional, transformer, and transfer learning methods. Furthermore, a k-fold paired t-test was conducted, which corroborates that our model outperforms the other approaches. Our methodology achieved F1-scores of 0.61 and 0.85 for Tamil and Malayalam, respectively, and is competitive with state-of-the-art methods. The code for our work can be found in our GitHub repository (https://github.com/arunimasundar/Hope-Speech-LT-EDI).
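The k-fold paired t-test mentioned in the description can be illustrated with a minimal sketch. This is not the authors' code: the function name `paired_t_statistic` and the per-fold F1-scores are hypothetical, invented only to show the calculation. It assumes both models were scored on the same k folds and computes the t statistic from the per-fold score differences, with k - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for per-fold paired scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models need one score per fold")
    k = len(scores_a)
    if k < 2:
        raise ValueError("need at least two folds")
    # Difference in score between the two models on each fold
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    # Mean difference divided by its standard error
    t = mean(diffs) / (sd / math.sqrt(k))
    return t, k - 1


# Hypothetical per-fold F1-scores, invented for illustration only
proposed = [0.62, 0.60, 0.63, 0.59, 0.61]
baseline = [0.58, 0.57, 0.60, 0.57, 0.58]
t_stat, dof = paired_t_statistic(proposed, baseline)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The resulting t value would then be compared against a Student's t distribution with the returned degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch within the standard library.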
format Online
Article
Text
id pubmed-8600493
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer Singapore
record_format MEDLINE/PubMed
spelling pubmed-8600493 2021-11-18 SN Comput Sci Original Research
Springer Singapore 2021-11-18 2022 /pmc/articles/PMC8600493/ /pubmed/34812423 http://dx.doi.org/10.1007/s42979-021-00943-8 Text en © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Hope Speech Detection for Dravidian Languages Using Cross-Lingual Embeddings with Stacked Encoder Architecture
topic Original Research