Conserving Semantic Unit Information and Simplifying Syntactic Constituents to Improve Implicit Discourse Relation Recognition
Implicit discourse relation recognition (IDRR) has long been considered a challenging problem in shallow discourse parsing. The absence of connectives makes such relations implicit and requires much more effort to understand the semantics of the text. Thus, it is important to preserve the semantic c...
Main Authors: | Fang, Zhongyang, Cong, Yue, Chai, Yuhan, Gao, Chengliang, Chen, Ximing, Qiu, Jing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10529030/ https://www.ncbi.nlm.nih.gov/pubmed/37761593 http://dx.doi.org/10.3390/e25091294 |
_version_ | 1785111339586289664 |
---|---|
author | Fang, Zhongyang Cong, Yue Chai, Yuhan Gao, Chengliang Chen, Ximing Qiu, Jing |
author_sort | Fang, Zhongyang |
collection | PubMed |
description | Implicit discourse relation recognition (IDRR) has long been considered a challenging problem in shallow discourse parsing. The absence of connectives makes such relations implicit and requires much more effort to understand the semantics of the text. Thus, it is important to preserve the semantic completeness before any attempt to predict the discourse relation. However, word level embedding, widely used in existing works, may lead to a loss of semantics by splitting some phrases that should be treated as complete semantic units. In this article, we proposed three methods to segment a sentence into complete semantic units: a corpus-based method to serve as the baseline, a constituent parsing tree-based method, and a dependency parsing tree-based method to provide a more flexible and automatic way to divide the sentence. The segmented sentence will then be embedded at the level of semantic units so the embeddings could be fed into the IDRR networks and play the same role as word embeddings. We implemented our methods into one of the recent IDRR models to compare the performance with the original version using word level embeddings. Results show that proper embedding level better conserves the semantic information in the sentence and helps to enhance the performance of IDRR models. |
format | Online Article Text |
id | pubmed-10529030 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-105290302023-09-28 Conserving Semantic Unit Information and Simplifying Syntactic Constituents to Improve Implicit Discourse Relation Recognition Fang, Zhongyang Cong, Yue Chai, Yuhan Gao, Chengliang Chen, Ximing Qiu, Jing Entropy (Basel) Article Implicit discourse relation recognition (IDRR) has long been considered a challenging problem in shallow discourse parsing. The absence of connectives makes such relations implicit and requires much more effort to understand the semantics of the text. Thus, it is important to preserve the semantic completeness before any attempt to predict the discourse relation. However, word level embedding, widely used in existing works, may lead to a loss of semantics by splitting some phrases that should be treated as complete semantic units. In this article, we proposed three methods to segment a sentence into complete semantic units: a corpus-based method to serve as the baseline, a constituent parsing tree-based method, and a dependency parsing tree-based method to provide a more flexible and automatic way to divide the sentence. The segmented sentence will then be embedded at the level of semantic units so the embeddings could be fed into the IDRR networks and play the same role as word embeddings. We implemented our methods into one of the recent IDRR models to compare the performance with the original version using word level embeddings. Results show that proper embedding level better conserves the semantic information in the sentence and helps to enhance the performance of IDRR models. MDPI 2023-09-04 /pmc/articles/PMC10529030/ /pubmed/37761593 http://dx.doi.org/10.3390/e25091294 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Fang, Zhongyang Cong, Yue Chai, Yuhan Gao, Chengliang Chen, Ximing Qiu, Jing Conserving Semantic Unit Information and Simplifying Syntactic Constituents to Improve Implicit Discourse Relation Recognition |
title | Conserving Semantic Unit Information and Simplifying Syntactic Constituents to Improve Implicit Discourse Relation Recognition |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10529030/ https://www.ncbi.nlm.nih.gov/pubmed/37761593 http://dx.doi.org/10.3390/e25091294 |