
Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool

Lately, discourse structure has received considerable attention due to the benefits its application offers in several NLP tasks such as opinion mining, summarization, question answering, text simplification, among others. When automatically analyzing texts, discourse parsers typically perform two di...


Bibliographic Details
Main authors: Atutxa, Aitziber, Bengoetxea, Kepa, Diaz de Ilarraza, Arantza, Iruskieta, Mikel
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6726195/
https://www.ncbi.nlm.nih.gov/pubmed/31483814
http://dx.doi.org/10.1371/journal.pone.0221639
_version_ 1783449058091728896
author Atutxa, Aitziber
Bengoetxea, Kepa
Diaz de Ilarraza, Arantza
Iruskieta, Mikel
collection PubMed
description Lately, discourse structure has received considerable attention due to the benefits its application offers in several NLP tasks such as opinion mining, summarization, question answering, text simplification, among others. When automatically analyzing texts, discourse parsers typically perform two different tasks: i) identification of basic discourse units (text segmentation), and ii) linking discourse units by means of discourse relations, building structures such as trees or graphs. The resulting discourse structures are, in general terms, accurate at intra-sentence discourse-level relations; however, they fail to capture the correct inter-sentence relations. Detecting the main discourse unit (the Central Unit) is helpful for discourse analyzers (and also for manual annotation) in improving their results in rhetorical labeling. Bearing this in mind, we set out to build the first two steps of a discourse parser following a top-down strategy: i) to find discourse units, and ii) to detect the Central Unit. The final step, i.e. assigning rhetorical relations, remains to be worked on in the immediate future. In accordance with this strategy, our paper presents a tool consisting of a discourse segmenter and an automatic Central Unit detector.
format Online
Article
Text
id pubmed-6726195
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6726195 2019-09-16 Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool Atutxa, Aitziber Bengoetxea, Kepa Diaz de Ilarraza, Arantza Iruskieta, Mikel PLoS One Research Article Lately, discourse structure has received considerable attention due to the benefits its application offers in several NLP tasks such as opinion mining, summarization, question answering, text simplification, among others. When automatically analyzing texts, discourse parsers typically perform two different tasks: i) identification of basic discourse units (text segmentation), and ii) linking discourse units by means of discourse relations, building structures such as trees or graphs. The resulting discourse structures are, in general terms, accurate at intra-sentence discourse-level relations; however, they fail to capture the correct inter-sentence relations. Detecting the main discourse unit (the Central Unit) is helpful for discourse analyzers (and also for manual annotation) in improving their results in rhetorical labeling. Bearing this in mind, we set out to build the first two steps of a discourse parser following a top-down strategy: i) to find discourse units, and ii) to detect the Central Unit. The final step, i.e. assigning rhetorical relations, remains to be worked on in the immediate future. In accordance with this strategy, our paper presents a tool consisting of a discourse segmenter and an automatic Central Unit detector. Public Library of Science 2019-09-04 /pmc/articles/PMC6726195/ /pubmed/31483814 http://dx.doi.org/10.1371/journal.pone.0221639 Text en © 2019 Atutxa et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Atutxa, Aitziber
Bengoetxea, Kepa
Diaz de Ilarraza, Arantza
Iruskieta, Mikel
Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool
title Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6726195/
https://www.ncbi.nlm.nih.gov/pubmed/31483814
http://dx.doi.org/10.1371/journal.pone.0221639