Topological data analysis identifies molecular phenotypes of idiopathic pulmonary fibrosis

Bibliographic Details
Main authors: Shapanis, Andrew; Jones, Mark G; Schofield, James; Skipp, Paul
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2023
Subjects: Interstitial Lung Disease
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10314053/
https://www.ncbi.nlm.nih.gov/pubmed/36808085
http://dx.doi.org/10.1136/thorax-2022-219731
author Shapanis, Andrew
Jones, Mark G
Schofield, James
Skipp, Paul
collection PubMed
description BACKGROUND: Idiopathic pulmonary fibrosis (IPF) is a debilitating, progressive disease with a median survival time of 3–5 years. Diagnosis remains challenging and disease progression varies greatly, suggesting the possibility of distinct subphenotypes. METHODS AND RESULTS: We analysed publicly available peripheral blood mononuclear cell expression datasets for 219 IPF, 411 asthma, 362 tuberculosis, 151 healthy, 92 HIV and 83 other disease samples, totalling 1318 patients. We integrated the datasets and split them into train (n=871) and test (n=477) cohorts to investigate the utility of a machine learning model (support vector machine) for predicting IPF. A panel of 44 genes predicted IPF in a background of healthy, tuberculosis, HIV and asthma with an area under the curve of 0.9464, corresponding to a sensitivity of 0.865 and a specificity of 0.89. We then applied topological data analysis to investigate the possibility of subphenotypes within IPF. We identified five molecular subphenotypes of IPF, one of which corresponded to a phenotype enriched for death/transplant. The subphenotypes were molecularly characterised using bioinformatic and pathway analysis tools identifying distinct subphenotype features including one which suggests an extrapulmonary or systemic fibrotic disease. CONCLUSIONS: Integration of multiple datasets, from the same tissue, enabled the development of a model to accurately predict IPF using a panel of 44 genes. Furthermore, topological data analysis identified distinct subphenotypes of patients with IPF which were defined by differences in molecular pathobiology and clinical characteristics.
format Online
Article
Text
id pubmed-10314053
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-10314053 2023-07-02 Topological data analysis identifies molecular phenotypes of idiopathic pulmonary fibrosis. Shapanis, Andrew; Jones, Mark G; Schofield, James; Skipp, Paul. Thorax. Interstitial Lung Disease.
BMJ Publishing Group 2023-07 2023-02-20 /pmc/articles/PMC10314053/ /pubmed/36808085 http://dx.doi.org/10.1136/thorax-2022-219731 Text en
© Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
title Topological data analysis identifies molecular phenotypes of idiopathic pulmonary fibrosis
topic Interstitial Lung Disease