Topic Categorisation of Statements in Suicide Notes with Integrated Rules and Machine Learning

We describe and evaluate an automated approach used as part of the i2b2 2011 challenge to identify and categorise statements in suicide notes into one of 15 topics, including Love, Guilt, Thankfulness, Hopelessness and Instructions. The approach combines a set of lexico-syntactic rules with a set of...

Bibliographic Details
Main Authors: Kovačević, Aleksandar, Dehghan, Azad, Keane, John A., Nenadic, Goran
Format: Online Article Text
Language: English
Published: Libertas Academica 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3409492/
https://www.ncbi.nlm.nih.gov/pubmed/22879767
http://dx.doi.org/10.4137/BII.S8978
author Kovačević, Aleksandar
Dehghan, Azad
Keane, John A.
Nenadic, Goran
collection PubMed
description We describe and evaluate an automated approach used as part of the i2b2 2011 challenge to identify and categorise statements in suicide notes into one of 15 topics, including Love, Guilt, Thankfulness, Hopelessness and Instructions. The approach combines a set of lexico-syntactic rules with a set of models derived by machine learning from a training dataset. The machine learning models rely on named entities, lexical, lexico-semantic and presentation features, as well as the rules that are applicable to a given statement. On a testing set of 300 suicide notes, the approach showed the overall best micro F-measure of up to 53.36%. The best precision achieved was 67.17% when only rules are used, whereas best recall of 50.57% was with integrated rules and machine learning. While some topics (eg, Sorrow, Anger, Blame) prove challenging, the performance for relatively frequent (eg, Love) and well-scoped categories (eg, Thankfulness) was comparatively higher (precision between 68% and 79%), suggesting that automated text mining approaches can be effective in topic categorisation of suicide notes.
format Online
Article
Text
id pubmed-3409492
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-3409492 2012-08-09 Topic Categorisation of Statements in Suicide Notes with Integrated Rules and Machine Learning Kovačević, Aleksandar Dehghan, Azad Keane, John A. Nenadic, Goran Biomed Inform Insights Original Research Libertas Academica 2012-01-30 /pmc/articles/PMC3409492/ /pubmed/22879767 http://dx.doi.org/10.4137/BII.S8978 Text en © the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article. Unrestricted non-commercial use is permitted provided the original work is properly cited.
title Topic Categorisation of Statements in Suicide Notes with Integrated Rules and Machine Learning
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3409492/
https://www.ncbi.nlm.nih.gov/pubmed/22879767
http://dx.doi.org/10.4137/BII.S8978