Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records
ABSTRACT: BACKGROUND: Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by restricted, repetitive behavior, and impaired social communication and interactions. However, significant challenges remain in diagnosing and subtyping ASD due in part to the lack of a val...
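Main authors: | Zhao, Mengge; Havrilla, James; Peng, Jacqueline; Drye, Madison; Fecher, Maddie; Guthrie, Whitney; Tunc, Birkan; Schultz, Robert; Wang, Kai; Zhou, Yunyun |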
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2022
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9128253/ https://www.ncbi.nlm.nih.gov/pubmed/35606697 http://dx.doi.org/10.1186/s11689-022-09442-0 |
_version_ | 1784712525871316992 |
---|---|
author | Zhao, Mengge Havrilla, James Peng, Jacqueline Drye, Madison Fecher, Maddie Guthrie, Whitney Tunc, Birkan Schultz, Robert Wang, Kai Zhou, Yunyun |
author_facet | Zhao, Mengge Havrilla, James Peng, Jacqueline Drye, Madison Fecher, Maddie Guthrie, Whitney Tunc, Birkan Schultz, Robert Wang, Kai Zhou, Yunyun |
author_sort | Zhao, Mengge |
collection | PubMed |
description | ABSTRACT: BACKGROUND: Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by restricted, repetitive behavior and impaired social communication and interaction. However, significant challenges remain in diagnosing and subtyping ASD, due in part to the lack of a validated, standardized vocabulary to characterize the clinical phenotypic presentation of ASD. Although the Human Phenotype Ontology (HPO) plays an important role in delineating nuanced phenotypes for rare genetic diseases, it is inadequate for capturing the characteristics of behavioral and psychiatric phenotypes in individuals with ASD. There is a clear need, therefore, for a well-established phenotype terminology set that can assist in characterizing ASD phenotypes from patients’ clinical narratives. METHODS: To address this challenge, we used natural language processing (NLP) techniques to identify and curate ASD phenotypic terms from high-quality unstructured clinical notes in the electronic health record (EHR) on 8499 individuals with ASD, 8177 individuals with non-ASD psychiatric disorders, and 8482 individuals without a documented psychiatric disorder. We further performed a dimensionality-reduction clustering analysis to subgroup individuals with ASD, using the nonnegative matrix factorization method. RESULTS: Through a note-processing pipeline that includes several steps of state-of-the-art NLP approaches, we identified 3336 ASD terms linking to 1943 unique medical concepts, which represents one of the largest ASD terminology sets to date. The extracted ASD terms were further organized in a formal ontology structure similar to the HPO. Clustering analysis showed that these terms could be used in a diagnostic pipeline to differentiate individuals with ASD from individuals with other psychiatric disorders. CONCLUSION: Our ASD phenotype ontology can assist clinicians and researchers in characterizing individuals with ASD, facilitating automated diagnosis, and subtyping individuals with ASD to facilitate personalized therapeutic decision-making. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s11689-022-09442-0. |
format | Online Article Text |
id | pubmed-9128253 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9128253 2022-05-25 J Neurodev Disord Research BioMed Central 2022-05-23 /pmc/articles/PMC9128253/ /pubmed/35606697 http://dx.doi.org/10.1186/s11689-022-09442-0 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Zhao, Mengge Havrilla, James Peng, Jacqueline Drye, Madison Fecher, Maddie Guthrie, Whitney Tunc, Birkan Schultz, Robert Wang, Kai Zhou, Yunyun Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records |
title | Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9128253/ https://www.ncbi.nlm.nih.gov/pubmed/35606697 http://dx.doi.org/10.1186/s11689-022-09442-0 |