Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records

ABSTRACT: BACKGROUND: Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by restricted, repetitive behavior, and impaired social communication and interactions. However, significant challenges remain in diagnosing and subtyping ASD due in part to the lack of a val...

Bibliographic Details
Main authors: Zhao, Mengge, Havrilla, James, Peng, Jacqueline, Drye, Madison, Fecher, Maddie, Guthrie, Whitney, Tunc, Birkan, Schultz, Robert, Wang, Kai, Zhou, Yunyun
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9128253/
https://www.ncbi.nlm.nih.gov/pubmed/35606697
http://dx.doi.org/10.1186/s11689-022-09442-0
_version_ 1784712525871316992
author Zhao, Mengge
Havrilla, James
Peng, Jacqueline
Drye, Madison
Fecher, Maddie
Guthrie, Whitney
Tunc, Birkan
Schultz, Robert
Wang, Kai
Zhou, Yunyun
author_facet Zhao, Mengge
Havrilla, James
Peng, Jacqueline
Drye, Madison
Fecher, Maddie
Guthrie, Whitney
Tunc, Birkan
Schultz, Robert
Wang, Kai
Zhou, Yunyun
author_sort Zhao, Mengge
collection PubMed
description ABSTRACT: BACKGROUND: Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by restricted, repetitive behavior, and impaired social communication and interactions. However, significant challenges remain in diagnosing and subtyping ASD due in part to the lack of a validated, standardized vocabulary to characterize clinical phenotypic presentation of ASD. Although the human phenotype ontology (HPO) plays an important role in delineating nuanced phenotypes for rare genetic diseases, it is inadequate to capture the characteristics of behavioral and psychiatric phenotypes for individuals with ASD. There is a clear need, therefore, for a well-established phenotype terminology set that can assist in characterization of ASD phenotypes from patients’ clinical narratives. METHODS: To address this challenge, we used natural language processing (NLP) techniques to identify and curate ASD phenotypic terms from high-quality unstructured clinical notes in the electronic health record (EHR) on 8499 individuals with ASD, 8177 individuals with non-ASD psychiatric disorders, and 8482 individuals without a documented psychiatric disorder. We further performed dimensional reduction clustering analysis to subgroup individuals with ASD, using the nonnegative matrix factorization method. RESULTS: Through a note-processing pipeline that includes several steps of state-of-the-art NLP approaches, we identified 3336 ASD terms linking to 1943 unique medical concepts, which represents one of the largest ASD terminology sets to date. The extracted ASD terms were further organized in a formal ontology structure similar to the HPO. Clustering analysis showed that these terms could be used in a diagnostic pipeline to differentiate individuals with ASD from individuals with other psychiatric disorders.
CONCLUSION: Our ASD phenotype ontology can assist clinicians and researchers in characterizing individuals with ASD, facilitating automated diagnosis, and subtyping individuals with ASD to facilitate personalized therapeutic decision-making. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s11689-022-09442-0.
format Online
Article
Text
id pubmed-9128253
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9128253 2022-05-25 Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records Zhao, Mengge Havrilla, James Peng, Jacqueline Drye, Madison Fecher, Maddie Guthrie, Whitney Tunc, Birkan Schultz, Robert Wang, Kai Zhou, Yunyun J Neurodev Disord Research BioMed Central 2022-05-23 /pmc/articles/PMC9128253/ /pubmed/35606697 http://dx.doi.org/10.1186/s11689-022-09442-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Zhao, Mengge
Havrilla, James
Peng, Jacqueline
Drye, Madison
Fecher, Maddie
Guthrie, Whitney
Tunc, Birkan
Schultz, Robert
Wang, Kai
Zhou, Yunyun
Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records
title Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9128253/
https://www.ncbi.nlm.nih.gov/pubmed/35606697
http://dx.doi.org/10.1186/s11689-022-09442-0