Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach
BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a combination of progressive lung diseases. The diagnosis of COPD is generally based on pulmonary function testing; however, difficulties remain in the prognosis of smokers and early-stage COPD patients due to the complexity and heterogen...
Main Authors: | Matsumura, Kazushi, Ito, Shigeaki |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6998147/ https://www.ncbi.nlm.nih.gov/pubmed/32013930 http://dx.doi.org/10.1186/s12890-020-1062-9 |
_version_ | 1783493807839379456 |
---|---|
author | Matsumura, Kazushi Ito, Shigeaki |
author_facet | Matsumura, Kazushi Ito, Shigeaki |
author_sort | Matsumura, Kazushi |
collection | PubMed |
description | BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a combination of progressive lung diseases. The diagnosis of COPD is generally based on pulmonary function testing; however, difficulties remain in the prognosis of smokers and early-stage COPD patients due to the complexity and heterogeneity of the pathogenesis. Computational analyses of omics technologies are expected to be one of the solutions for resolving such complexities. METHODS: We obtained transcriptomic data by in vitro testing in which human bronchial epithelial cells were exposed to inducers of early COPD events, in order to identify potential descriptive marker genes. With the identified genes, a machine learning technique was applied to publicly available transcriptome data obtained from lung specimens of COPD and non-COPD patients to develop a model that can reflect the risk continuum across smoking and COPD. RESULTS: The expression levels of 15 genes were commonly altered among in vitro tissues exposed to known inducible factors for earlier events of COPD (exposure to cigarette smoke, DNA damage, oxidative stress, and inflammation), and 10 of these genes and their corresponding proteins have not previously been reported as COPD biomarkers. Although these genes were able to predict each group with 65% accuracy, the accuracy with which they were able to discriminate COPD subjects from smokers was only 29%. Furthermore, logistic regression enabled the conversion of gene expression levels to a numerical index, which we named the “potential risk factor (PRF)” index. The highest significant index value was recorded in COPD subjects (0.56 at the median), followed by smokers (0.30) and non-smokers (0.02). In vitro tissues exposed to cigarette smoke displayed dose-dependent increases of PRF, suggesting its utility for prospective risk estimation of tobacco products. CONCLUSIONS: Our experimental-based transcriptomic analysis identified novel genes associated with COPD, and the 15 genes could distinguish smokers and COPD subjects from non-smokers via machine-learning classification with remarkable accuracy. We also suggested a PRF index that can quantitatively reflect the risk continuum across smoking and COPD pathogenesis, and we believe it will provide an improved understanding of smoking effects and new insights into COPD. |
format | Online Article Text |
id | pubmed-6998147 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6998147 2020-02-05 Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach Matsumura, Kazushi Ito, Shigeaki BMC Pulm Med Technical Advance BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a combination of progressive lung diseases. The diagnosis of COPD is generally based on pulmonary function testing; however, difficulties remain in the prognosis of smokers and early-stage COPD patients due to the complexity and heterogeneity of the pathogenesis. Computational analyses of omics technologies are expected to be one of the solutions for resolving such complexities. METHODS: We obtained transcriptomic data by in vitro testing in which human bronchial epithelial cells were exposed to inducers of early COPD events, in order to identify potential descriptive marker genes. With the identified genes, a machine learning technique was applied to publicly available transcriptome data obtained from lung specimens of COPD and non-COPD patients to develop a model that can reflect the risk continuum across smoking and COPD. RESULTS: The expression levels of 15 genes were commonly altered among in vitro tissues exposed to known inducible factors for earlier events of COPD (exposure to cigarette smoke, DNA damage, oxidative stress, and inflammation), and 10 of these genes and their corresponding proteins have not previously been reported as COPD biomarkers. Although these genes were able to predict each group with 65% accuracy, the accuracy with which they were able to discriminate COPD subjects from smokers was only 29%. Furthermore, logistic regression enabled the conversion of gene expression levels to a numerical index, which we named the “potential risk factor (PRF)” index. The highest significant index value was recorded in COPD subjects (0.56 at the median), followed by smokers (0.30) and non-smokers (0.02). In vitro tissues exposed to cigarette smoke displayed dose-dependent increases of PRF, suggesting its utility for prospective risk estimation of tobacco products. CONCLUSIONS: Our experimental-based transcriptomic analysis identified novel genes associated with COPD, and the 15 genes could distinguish smokers and COPD subjects from non-smokers via machine-learning classification with remarkable accuracy. We also suggested a PRF index that can quantitatively reflect the risk continuum across smoking and COPD pathogenesis, and we believe it will provide an improved understanding of smoking effects and new insights into COPD. BioMed Central 2020-02-03 /pmc/articles/PMC6998147/ /pubmed/32013930 http://dx.doi.org/10.1186/s12890-020-1062-9 Text en © The Author(s). 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Technical Advance Matsumura, Kazushi Ito, Shigeaki Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
title | Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
title_full | Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
title_fullStr | Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
title_full_unstemmed | Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
title_short | Novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
title_sort | novel biomarker genes which distinguish between smokers and chronic obstructive pulmonary disease patients with machine learning approach |
topic | Technical Advance |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6998147/ https://www.ncbi.nlm.nih.gov/pubmed/32013930 http://dx.doi.org/10.1186/s12890-020-1062-9 |
work_keys_str_mv | AT matsumurakazushi novelbiomarkergeneswhichdistinguishbetweensmokersandchronicobstructivepulmonarydiseasepatientswithmachinelearningapproach AT itoshigeaki novelbiomarkergeneswhichdistinguishbetweensmokersandchronicobstructivepulmonarydiseasepatientswithmachinelearningapproach |