Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders

Computerized natural language processing (NLP) allows for objective and sensitive detection of speech disturbance, a hallmark of schizophrenia spectrum disorders (SSD). We explored several methods for characterizing speech changes in SSD (n = 20) compared to healthy control (HC) participants (n = 11...

Bibliographic Details
Main Authors: Tang, Sunny X., Kriz, Reno, Cho, Sunghye, Park, Suh Jung, Harowitz, Jenna, Gur, Raquel E., Bhati, Mahendra T., Wolf, Daniel H., Sedoc, João, Liberman, Mark Y.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8121795/
https://www.ncbi.nlm.nih.gov/pubmed/33990615
http://dx.doi.org/10.1038/s41537-021-00154-3
author Tang, Sunny X.
Kriz, Reno
Cho, Sunghye
Park, Suh Jung
Harowitz, Jenna
Gur, Raquel E.
Bhati, Mahendra T.
Wolf, Daniel H.
Sedoc, João
Liberman, Mark Y.
author_sort Tang, Sunny X.
collection PubMed
description Computerized natural language processing (NLP) allows for objective and sensitive detection of speech disturbance, a hallmark of schizophrenia spectrum disorders (SSD). We explored several methods for characterizing speech changes in SSD (n = 20) compared to healthy control (HC) participants (n = 11) and approached linguistic phenotyping on three levels: individual words, parts-of-speech (POS), and sentence-level coherence. NLP features were compared with a clinical gold standard, the Scale for the Assessment of Thought, Language and Communication (TLC). We utilized Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art embedding algorithm incorporating bidirectional context. Through the POS approach, we found that SSD used more pronouns but fewer adverbs, adjectives, and determiners (e.g., “the,” “a,”). Analysis of individual word usage was notable for more frequent use of first-person singular pronouns among individuals with SSD and first-person plural pronouns among HC. There was a striking increase in incomplete words among SSD. Sentence-level analysis using BERT reflected increased tangentiality among SSD with greater sentence embedding distances. The SSD sample had low speech disturbance on average and there was no difference in group means for TLC scores. However, NLP measures of language disturbance appear to be sensitive to these subclinical differences and showed greater ability to discriminate between HC and SSD than a model based on clinical ratings alone. These intriguing exploratory results from a small sample prompt further inquiry into NLP methods for characterizing language disturbance in SSD and suggest that NLP measures may yield clinically relevant and informative biomarkers.
format Online
Article
Text
id pubmed-8121795
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8121795 2021-05-17 NPJ Schizophr Article Nature Publishing Group UK 2021-05-14 /pmc/articles/PMC8121795/ /pubmed/33990615 http://dx.doi.org/10.1038/s41537-021-00154-3 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders
topic Article