Predicting phenotypes of asthma and eczema with machine learning
BACKGROUND: There is increasing recognition that asthma and eczema are heterogeneous diseases. We investigated the predictive ability of a spectrum of machine learning methods to disambiguate clinical sub-groups of asthma, wheeze and eczema, using a large heterogeneous set of attributes in an unsele...
Main authors: | Prosperi, Mattia CF; Marinho, Susana; Simpson, Angela; Custovic, Adnan; Buchan, Iain E |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4101570/ https://www.ncbi.nlm.nih.gov/pubmed/25077568 http://dx.doi.org/10.1186/1755-8794-7-S1-S7 |
_version_ | 1782480923159166976 |
---|---|
author | Prosperi, Mattia CF; Marinho, Susana; Simpson, Angela; Custovic, Adnan; Buchan, Iain E |
collection | PubMed |
description | BACKGROUND: There is increasing recognition that asthma and eczema are heterogeneous diseases. We investigated the predictive ability of a spectrum of machine learning methods to disambiguate clinical sub-groups of asthma, wheeze and eczema, using a large heterogeneous set of attributes in an unselected population. The aim was to identify to what extent such heterogeneous information can be combined to reveal specific clinical manifestations. METHODS: The study population comprised a cross-sectional sample of adults, and included representatives of the general population enriched by subjects with asthma. Linear and non-linear machine learning methods, from logistic regression to random forests, were fit on a large attribute set including demographic, clinical and laboratory features, genetic profiles and environmental exposures. Outcomes of interest were asthma, wheeze and eczema, encoded by different operational definitions. Model validation was performed via bootstrapping. RESULTS: The study population included 554 adults, 42% male, 38% previous or current smokers. The proportions of asthma, wheeze, and eczema diagnoses were 16.7%, 12.3%, and 21.7%, respectively. Models were fit on 223 non-genetic variables plus 215 single nucleotide polymorphisms. In general, non-linear models achieved higher sensitivity and specificity than other methods, especially for asthma and wheeze, less for eczema, with areas under the receiver operating characteristic curve of 84%, 76% and 64%, respectively. Our findings confirm that allergen sensitisation and lung function characterise asthma better in combination than separately. The predictive ability of genetic markers alone is limited. For eczema, new predictors such as bio-impedance were discovered. CONCLUSIONS: More usefully-complex modelling is the key to a better understanding of disease mechanisms and personalised healthcare: further advances are likely with the incorporation of more factors/attributes and longitudinal measures. |
format | Online Article Text |
id | pubmed-4101570 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4101570 2014-07-18 Predicting phenotypes of asthma and eczema with machine learning Prosperi, Mattia CF Marinho, Susana Simpson, Angela Custovic, Adnan Buchan, Iain E BMC Med Genomics Research BioMed Central 2014-05-08 /pmc/articles/PMC4101570/ /pubmed/25077568 http://dx.doi.org/10.1186/1755-8794-7-S1-S7 Text en Copyright © 2014 Prosperi et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Predicting phenotypes of asthma and eczema with machine learning |
topic | Research |
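
The abstract reports areas under the receiver operating characteristic curve and states that model validation was performed via bootstrapping, without giving the procedure. As a rough illustration only, the following minimal sketch computes an AUC and a percentile bootstrap interval for it. It is not the authors' pipeline: the data are synthetic (only the sample size of 554 and the 16.7% asthma proportion are borrowed from the abstract), the function names `auc` and `bootstrap_auc` are hypothetical, and the sketch resamples fixed prediction scores rather than refitting any model on each resample.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation:
    the fraction of positive/negative pairs ranked correctly, ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=200, seed=0):
    """Percentile bootstrap: resample subjects with replacement and
    recompute the AUC on each resample."""
    auc(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one of the classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(0.025 * n_boot)]
    hi = stats[min(n_boot - 1, int(0.975 * n_boot))]
    return sum(stats) / n_boot, (lo, hi)


# Synthetic data only (assumption): 554 subjects, about 16.7% positives,
# with scores drawn one standard deviation higher for positives.
rng = random.Random(42)
labels = [1 if rng.random() < 0.167 else 0 for _ in range(554)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
mean_auc, (lo, hi) = bootstrap_auc(labels, scores)
print(f"AUC={point:.3f}, bootstrap mean={mean_auc:.3f}, "
      f"95% interval=({lo:.3f}, {hi:.3f})")
```

A fuller validation in the spirit of the abstract would refit each model (for example logistic regression or a random forest) on every bootstrap resample and score it on the subjects left out of that resample; the sketch above only quantifies the sampling variability of the AUC for a fixed set of scores.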