
Data-driven identification of ageing-related diseases from electronic health records

Reducing the burden of late-life morbidity requires an understanding of the mechanisms of ageing-related diseases (ARDs), defined as diseases that accumulate with increasing age. This has been hampered by the lack of formal criteria to identify ARDs. Here, we present a framework to identify ARDs usi...


Bibliographic Details
Main Authors: Kuan, Valerie, Fraser, Helen C., Hingorani, Melanie, Denaxas, Spiros, Gonzalez-Izquierdo, Arturo, Direk, Kenan, Nitsch, Dorothea, Mathur, Rohini, Parisinos, Constantinos A., Lumbers, R. Thomas, Sofat, Reecha, Wong, Ian C. K., Casas, Juan P., Thornton, Janet M., Hemingway, Harry, Partridge, Linda, Hingorani, Aroon D.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7859412/
https://www.ncbi.nlm.nih.gov/pubmed/33536532
http://dx.doi.org/10.1038/s41598-021-82459-y
_version_ 1783646727327186944
author Kuan, Valerie
Fraser, Helen C.
Hingorani, Melanie
Denaxas, Spiros
Gonzalez-Izquierdo, Arturo
Direk, Kenan
Nitsch, Dorothea
Mathur, Rohini
Parisinos, Constantinos A.
Lumbers, R. Thomas
Sofat, Reecha
Wong, Ian C. K.
Casas, Juan P.
Thornton, Janet M.
Hemingway, Harry
Partridge, Linda
Hingorani, Aroon D.
author_sort Kuan, Valerie
collection PubMed
description Reducing the burden of late-life morbidity requires an understanding of the mechanisms of ageing-related diseases (ARDs), defined as diseases that accumulate with increasing age. This has been hampered by the lack of formal criteria to identify ARDs. Here, we present a framework to identify ARDs using two complementary methods consisting of unsupervised machine learning and actuarial techniques, which we applied to electronic health records (EHRs) from 3,009,048 individuals in England using primary care data from the Clinical Practice Research Datalink (CPRD) linked to the Hospital Episode Statistics admitted patient care dataset between 1 April 2010 and 31 March 2015 (mean age 49.7 years (s.d. 18.6), 51% female, 70% white ethnicity). We grouped 278 high-burden diseases into nine main clusters according to their patterns of disease onset, using a hierarchical agglomerative clustering algorithm. Four of these clusters, encompassing 207 diseases spanning diverse organ systems and clinical specialties, had rates of disease onset that clearly increased with chronological age. However, the ages of onset for these four clusters were strikingly different, with median age of onset 82 years (IQR 82–83) for Cluster 1, 77 years (IQR 75–77) for Cluster 2, 69 years (IQR 66–71) for Cluster 3 and 57 years (IQR 54–59) for Cluster 4. Fitting to ageing-related actuarial models confirmed that the vast majority of these 207 diseases had a high probability of being ageing-related. Cardiovascular diseases and cancers were highly represented, while benign neoplastic, skin and psychiatric conditions were largely absent from the four ageing-related clusters. Our framework identifies and clusters ARDs and can form the basis for fundamental and translational research into ageing pathways.
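The clustering step named in the description (grouping diseases by their pattern of onset with a hierarchical agglomerative algorithm) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the record: the toy disease labels and counts, the four age bands, the Euclidean distance, and the average-linkage rule. The abstract does not state which distance or linkage the authors used.

```python
import math

def normalise(curve):
    """Scale an onset curve so its values sum to 1 (compare shape, not burden)."""
    total = sum(curve)
    return [v / total for v in curve]

def distance(a, b):
    """Euclidean distance between two equally long curves (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(curves, n_clusters):
    """Average-linkage agglomerative clustering (linkage rule is an assumption).

    curves: dict mapping a label to a normalised onset curve.
    Returns a list of clusters, each a sorted list of labels.
    """
    # Start with every disease in its own cluster.
    clusters = [[name] for name in curves]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(p, q) for p in clusters[i] for q in clusters[j]]
                d = sum(distance(curves[p], curves[q]) for p, q in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair and continue.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical onset counts per age band (20-39, 40-59, 60-79, 80+).
toy_counts = {
    "late_a": [1, 2, 10, 40],
    "late_b": [2, 3, 12, 38],
    "mid_a": [2, 20, 25, 8],
    "mid_b": [3, 22, 24, 6],
    "early_a": [40, 10, 3, 1],
}
toy_curves = {name: normalise(c) for name, c in toy_counts.items()}
toy_clusters = agglomerate(toy_curves, 3)
print(toy_clusters)
```

Normalising each curve before clustering means diseases are grouped by when onset occurs rather than by how common they are, which is one plausible reading of "patterns of disease onset".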
format Online
Article
Text
id pubmed-7859412
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7859412 2021-02-05 Sci Rep Article Nature Publishing Group UK 2021-02-03 /pmc/articles/PMC7859412/ /pubmed/33536532 http://dx.doi.org/10.1038/s41598-021-82459-y Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Data-driven identification of ageing-related diseases from electronic health records
topic Article