Temporal information extraction from mental health records to identify duration of untreated psychosis

BACKGROUND: Duration of untreated psychosis (DUP) is an important clinical construct in the field of mental health, as longer DUP can be associated with worse intervention outcomes. DUP estimation requires knowledge about when psychosis symptoms first started (symptom onset), and when psychosis trea...

Bibliographic Details
Main authors: Viani, Natalia, Kam, Joyce, Yin, Lucia, Bittar, André, Dutta, Rina, Patel, Rashmi, Stewart, Robert, Velupillai, Sumithra
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7063705/
https://www.ncbi.nlm.nih.gov/pubmed/32156302
http://dx.doi.org/10.1186/s13326-020-00220-2
author Viani, Natalia
Kam, Joyce
Yin, Lucia
Bittar, André
Dutta, Rina
Patel, Rashmi
Stewart, Robert
Velupillai, Sumithra
collection PubMed
description BACKGROUND: Duration of untreated psychosis (DUP) is an important clinical construct in the field of mental health, as longer DUP can be associated with worse intervention outcomes. DUP estimation requires knowledge about when psychosis symptoms first started (symptom onset), and when psychosis treatment was initiated. Electronic health records (EHRs) represent a useful resource for retrospective clinical studies on DUP, but the core information underlying this construct is most likely to lie in free text, meaning it is not readily available for clinical research. Natural Language Processing (NLP) is a means of addressing this problem by automatically extracting relevant information in a structured form. As a first step, it is important to identify appropriate documents, i.e., those that are likely to include the information of interest. Next, temporal information extraction methods are needed to identify time references for early psychosis symptoms. This NLP challenge requires solving three different tasks: time expression extraction, symptom extraction, and temporal “linking”. In this study, we focus on the first step, using two relevant EHR datasets. RESULTS: We applied a rule-based NLP system for time expression extraction that we had previously adapted to a corpus of mental health EHRs from patients with a diagnosis of schizophrenia (first referrals). We extended this work by applying this NLP system to a larger set of documents and patients, to identify additional texts that would be relevant for our long-term goal, and developed a new corpus from a subset of these new texts (early intervention services). Furthermore, we added normalized value annotations (“2011–05”) to the annotated time expressions (“May 2011”) in both corpora. The finalized corpora were used for further NLP development and evaluation, with promising results (normalization accuracy 71–86%). 
To highlight the specificities of our annotation task, we also applied the final adapted NLP system to a different temporally annotated clinical corpus. CONCLUSIONS: Developing domain-specific methods is crucial to address complex NLP tasks such as symptom onset extraction and retrospective calculation of duration of a preclinical syndrome. To the best of our knowledge, this is the first clinical text resource annotated for temporal entities in the mental health domain.
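The normalization step mentioned in the description, mapping an expression such as "May 2011" to a value such as "2011-05", can be illustrated with a minimal Python sketch. This is not the authors' rule-based system: the function names `normalize_month_year` and `months_between` are hypothetical, only "Month YYYY" patterns are handled, and the output uses a plain hyphen. The second helper shows how two normalized values (for example, a symptom onset and a treatment start) could yield a duration in months.

```python
import re

# Month names mapped to their numbers (English only; an assumption of this sketch).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# A word followed by a four-digit year, e.g. "May 2011".
_MONTH_YEAR = re.compile(r"\b([A-Za-z]+)\s+(\d{4})\b")


def normalize_month_year(text):
    """Return 'YYYY-MM' for the first 'Month YYYY' expression in text, else None."""
    for match in _MONTH_YEAR.finditer(text):
        month = MONTHS.get(match.group(1).lower())
        if month is not None:
            return f"{match.group(2)}-{month:02d}"
    return None


def months_between(start, end):
    """Number of whole months from one 'YYYY-MM' value to another."""
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    return (end_year - start_year) * 12 + (end_month - start_month)
```

For instance, `normalize_month_year("symptoms began in May 2011")` returns `"2011-05"`, and `months_between("2011-05", "2012-02")` returns `9`.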
format Online
Article
Text
id pubmed-7063705
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7063705 2020-03-13
J Biomed Semantics. Research. BioMed Central, 2020-03-10.
/pmc/articles/PMC7063705/ /pubmed/32156302 http://dx.doi.org/10.1186/s13326-020-00220-2
Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Temporal information extraction from mental health records to identify duration of untreated psychosis
topic Research