EXSEQREG: Explaining sequence-based NLP tasks with regions with a case study using morphological features for named entity recognition

The state-of-the-art systems for most natural language engineering tasks employ machine learning methods. Despite the improved performance of these systems, there is a lack of established methods for assessing the quality of their predictions. This work introduces a method for explaining the predic...

Bibliographic Details
Main Authors: Güngör, Onur; Güngör, Tunga; Uskudarli, Suzan
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773252/
https://www.ncbi.nlm.nih.gov/pubmed/33378340
http://dx.doi.org/10.1371/journal.pone.0244179
_version_ 1783630021749899264
author Güngör, Onur
Güngör, Tunga
Uskudarli, Suzan
author_facet Güngör, Onur
Güngör, Tunga
Uskudarli, Suzan
author_sort Güngör, Onur
collection PubMed
description The state-of-the-art systems for most natural language engineering tasks employ machine learning methods. Despite the improved performance of these systems, there is a lack of established methods for assessing the quality of their predictions. This work introduces a method for explaining the predictions of any sequence-based natural language processing (NLP) task implemented with any model, neural or non-neural. Our method, named EXSEQREG, introduces the concept of a region, which links a prediction to the features that are potentially important for the model. A region is a list of positions in the input sentence associated with a single prediction. Many NLP tasks are compatible with the proposed explanation method, as regions can be formed according to the nature of the task. The method models the differences in prediction probability that are induced by carefully removing features used by the model. The output of the method is a list of importance values, each of which signifies the impact of the corresponding feature on the prediction. The proposed method is demonstrated with a neural-network-based named entity recognition (NER) tagger using Turkish and Finnish datasets. A qualitative analysis of the explanations is presented. The results are validated with a procedure based on the mutual information score of each feature. We show that this method produces reasonable explanations and may be used for (i) assessing the degree to which features contribute to a specific prediction of the model, and (ii) exploring the features that played a significant role for a trained model when analyzed across the corpus.
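The description above can be illustrated with a minimal Python sketch. It assumes the simplest possible reading of "removal of features": each feature is dropped from the region one at a time, and its importance is taken to be the resulting drop in prediction probability. This is a simplification for illustration and not necessarily the procedure used by the authors. The names `remove_feature`, `feature_importances`, and `toy_model`, as well as the tag names `Noun`, `Prop`, and `Loc`, are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence

# Hypothetical representation: each token is the set of feature names attached to it.
Sentence = List[FrozenSet[str]]
# Hypothetical model interface: returns the probability of the prediction being explained.
Model = Callable[[Sentence], float]


def remove_feature(sentence: Sentence, region: Sequence[int], feature: str) -> Sentence:
    """Return a copy of the sentence with `feature` dropped from every position in the region."""
    positions = set(region)
    return [
        feats - {feature} if i in positions else feats
        for i, feats in enumerate(sentence)
    ]


def feature_importances(model: Model, sentence: Sentence, region: Sequence[int]) -> Dict[str, float]:
    """Importance of a feature = probability with it minus probability without it."""
    base = model(sentence)
    # Collect every feature that occurs at some position of the region.
    features = sorted(set().union(*(sentence[i] for i in region)))
    return {f: base - model(remove_feature(sentence, region, f)) for f in features}


def toy_model(sentence: Sentence) -> float:
    """Stand-in for a trained tagger (assumption): scores a fixed 'location' prediction."""
    feats = set().union(*sentence)
    score = 0.25
    if "Prop" in feats:
        score += 0.5
    if "Loc" in feats:
        score += 0.125
    return score


# Two-token sentence; the region is the single position 1.
sentence = [frozenset({"Noun"}), frozenset({"Noun", "Prop", "Loc"})]
importances = feature_importances(toy_model, sentence, region=[1])
print(importances)
```

In this toy setup the feature the model relies on most receives the largest importance value, and a feature the model ignores receives zero. A real tagger would replace `toy_model`, and the region would be formed according to the task, for example the positions covered by one named entity.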
format Online
Article
Text
id pubmed-7773252
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7773252 2021-01-07 EXSEQREG: Explaining sequence-based NLP tasks with regions with a case study using morphological features for named entity recognition. Güngör, Onur; Güngör, Tunga; Uskudarli, Suzan. PLoS One. Research Article.
Public Library of Science 2020-12-30 /pmc/articles/PMC7773252/ /pubmed/33378340 http://dx.doi.org/10.1371/journal.pone.0244179 Text en © 2020 Güngör et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title EXSEQREG: Explaining sequence-based NLP tasks with regions with a case study using morphological features for named entity recognition
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773252/
https://www.ncbi.nlm.nih.gov/pubmed/33378340
http://dx.doi.org/10.1371/journal.pone.0244179