Hope Speech Detection for Dravidian Languages Using Cross-Lingual Embeddings with Stacked Encoder Architecture
The task of hope speech detection has gained traction in the natural language processing field owing to the need for an increase in positive reinforcement online during the COVID-19 pandemic. Hope speech detection focuses on identifying texts among social media comments that could invoke positive em...
Main authors: | Sundar, Arunima; Ramakrishnan, Akshay; Balaji, Avantika; Durairaj, Thenmozhi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Singapore 2021 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600493/ https://www.ncbi.nlm.nih.gov/pubmed/34812423 http://dx.doi.org/10.1007/s42979-021-00943-8 |
_version_ | 1784601164451414016 |
---|---|
author | Sundar, Arunima Ramakrishnan, Akshay Balaji, Avantika Durairaj, Thenmozhi |
author_sort | Sundar, Arunima |
collection | PubMed |
description | The task of hope speech detection has gained traction in natural language processing owing to the need for more positive reinforcement online during the COVID-19 pandemic. Hope speech detection focuses on identifying social media comments that could evoke positive emotions in people. Students and working adults alike report experiencing considerable work-induced stress, which suggests a need for external inspiration that, in the current scenario, is mostly found online. In this paper, we propose a multilingual model, with the main emphasis on Dravidian languages, to automatically detect hope speech. We employ a stacked encoder architecture that uses language-agnostic cross-lingual word embeddings, as the dataset consists of code-mixed YouTube comments. Additionally, we carried out an empirical analysis and tested our architecture against various traditional, transformer, and transfer learning methods. Furthermore, a k-fold paired t-test was conducted, which corroborates that our model outperforms the other approaches. Our methodology achieved F1-scores of 0.61 and 0.85 for Tamil and Malayalam, respectively, and is competitive with state-of-the-art methods. The code for our work can be found in our GitHub repository (https://github.com/arunimasundar/Hope-Speech-LT-EDI). |
format | Online Article Text |
id | pubmed-8600493 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer Singapore |
record_format | MEDLINE/PubMed |
spelling | pubmed-8600493 2021-11-18 Hope Speech Detection for Dravidian Languages Using Cross-Lingual Embeddings with Stacked Encoder Architecture Sundar, Arunima Ramakrishnan, Akshay Balaji, Avantika Durairaj, Thenmozhi SN Comput Sci Original Research Springer Singapore 2021-11-18 2022 /pmc/articles/PMC8600493/ /pubmed/34812423 http://dx.doi.org/10.1007/s42979-021-00943-8 Text en © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | Hope Speech Detection for Dravidian Languages Using Cross-Lingual Embeddings with Stacked Encoder Architecture |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600493/ https://www.ncbi.nlm.nih.gov/pubmed/34812423 http://dx.doi.org/10.1007/s42979-021-00943-8 |