Data-driven discovery of seasonally linked diseases from an Electronic Health Records system

BACKGROUND: Patterns of disease incidence can identify new risk factors for the disease or provide insight into the etiology. For example, allergies and infectious diseases have been shown to follow periodic temporal patterns due to seasonal changes in environmental or infectious agents. Previous wo...

Bibliographic Details
Main authors: Melamed, Rachel D; Khiabanian, Hossein; Rabadan, Raul
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4158606/
https://www.ncbi.nlm.nih.gov/pubmed/25078762
_version_ 1782334090022748160
author Melamed, Rachel D
Khiabanian, Hossein
Rabadan, Raul
collection PubMed
description BACKGROUND: Patterns of disease incidence can identify new risk factors for the disease or provide insight into the etiology. For example, allergies and infectious diseases have been shown to follow periodic temporal patterns due to seasonal changes in environmental or infectious agents. Previous work searching for seasonal or other temporal patterns in disease diagnosis rates has been limited both in the scope of the diseases examined and in the ability to distinguish unexpected seasonal patterns. Electronic Health Records (EHR) compile extensive longitudinal clinical information, constituting a unique source for discovery of trends in occurrence of disease. However, the data suffer from inherent biases that preclude straightforward identification of temporal trends. METHODS: Motivated by observation of the biases in this data source, we developed a method (Lomb-Scargle periodograms in detrended data, LSP-detrend) to find periodic patterns by adjusting the temporal information for broad trends in incidence, as well as seasonal changes in total hospitalizations. LSP-detrend can sensitively uncover periodic temporal patterns in the corrected data and identify the significance of the trend. We apply LSP-detrend to a compilation of records from 1.5 million patients encoded by ICD-9-CM (International Classification of Diseases, Ninth Revision, Clinical Modification), including 2,805 disorders with more than 500 occurrences across a 12-year period. RESULTS AND CONCLUSIONS: Although EHR data, and ICD-9 coded records in particular, were not created with the intention of aggregated use for research, these data can in fact be mined for periodic patterns in incidence of disease, if confounders are properly removed. Of all diagnoses, around 10% are identified as seasonal by LSP-detrend, including many known phenomena. We robustly reproduce previous findings, even for relatively rare diseases. For instance, Kawasaki disease, a rare childhood disease that has been associated with weather patterns, is detected as strongly linked with winter months. Among the novel results, we find a bi-annual increase in exacerbations of myasthenia gravis, a potentially life-threatening complication of an autoimmune disease. We dissect the causes of this seasonal incidence and propose that factors predisposing patients to this event vary through the year.
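The METHODS summary above describes detrending followed by a Lomb-Scargle periodogram. The following is a minimal sketch of that general idea, not the authors' LSP-detrend implementation. It assumes monthly counts, approximates the adjustment for total hospitalizations by dividing by monthly totals, removes only a linear trend, and omits the significance test. The function names and the synthetic data are hypothetical.

```python
import math


def detrend_rates(diagnosis_counts, total_counts):
    """Turn counts into per-visit rates, then subtract a least-squares line.

    Assumption: dividing by total visits stands in for the adjustment for
    seasonal changes in total hospitalizations; a linear fit stands in for
    the removal of broad trends in incidence.
    """
    rates = [d / t for d, t in zip(diagnosis_counts, total_counts)]
    n = len(rates)
    times = list(range(n))
    t_mean = sum(times) / n
    r_mean = sum(rates) / n
    slope = (sum((t - t_mean) * (r - r_mean) for t, r in zip(times, rates))
             / sum((t - t_mean) ** 2 for t in times))
    return [r - (r_mean + slope * (t - t_mean)) for t, r in zip(times, rates)]


def lomb_scargle_power(times, values, period):
    """Classical variance-normalized Lomb-Scargle power at one trial period."""
    omega = 2 * math.pi / period
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    # Time offset tau makes the sine and cosine terms orthogonal.
    tau = math.atan2(sum(math.sin(2 * omega * t) for t in times),
                     sum(math.cos(2 * omega * t) for t in times)) / (2 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    yc = sum((v - mean) * c for v, c in zip(values, cos_terms))
    ys = sum((v - mean) * s for v, s in zip(values, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)
    return (yc * yc / cc + ys * ys / ss) / (2 * var)


# Synthetic example (made-up numbers): 12 years of monthly data in which
# total visits grow and have their own seasonality, while the diagnosis
# rate has a slow upward trend plus an annual cycle.
months = list(range(144))
total_visits = [round(10000 + 20 * t + 500 * math.cos(2 * math.pi * t / 12))
                for t in months]
diagnosis = [round(total * (0.01 + 0.00002 * t
                            + 0.002 * math.sin(2 * math.pi * t / 12)))
             for t, total in zip(months, total_visits)]

residuals = detrend_rates(diagnosis, total_visits)
powers = {p: lomb_scargle_power(months, residuals, p) for p in range(3, 25)}
best_period = max(powers, key=powers.get)
print("strongest trial period (months):", best_period)
```

The trial periods here are restricted to whole months between 3 and 24 for brevity; a real analysis would scan a finer frequency grid and attach a significance estimate to each peak, as the abstract indicates LSP-detrend does.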
format Online
Article
Text
id pubmed-4158606
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4158606 2014-09-22 Data-driven discovery of seasonally linked diseases from an Electronic Health Records system Melamed, Rachel D; Khiabanian, Hossein; Rabadan, Raul BMC Bioinformatics Research BioMed Central 2014-05-16 /pmc/articles/PMC4158606/ /pubmed/25078762 Text en Copyright © 2014 Melamed et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Data-driven discovery of seasonally linked diseases from an Electronic Health Records system
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4158606/
https://www.ncbi.nlm.nih.gov/pubmed/25078762