Gene expression microarray public dataset reanalysis in chronic obstructive pulmonary disease

Chronic obstructive pulmonary disease (COPD) was classified by the Centers for Disease Control and Prevention in 2014 as the third leading cause of death in the United States (US). The main cause of COPD is exposure to tobacco smoke and air pollutants. Problems associated with COPD include under-dia...


Bibliographic Details
Main authors: Rogers, Lavida R. K.; Verlinde, Madison; Mias, George I.
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857915/
https://www.ncbi.nlm.nih.gov/pubmed/31730674
http://dx.doi.org/10.1371/journal.pone.0224750
_version_ 1783470848218234880
author Rogers, Lavida R. K.
Verlinde, Madison
Mias, George I.
collection PubMed
description Chronic obstructive pulmonary disease (COPD) was classified by the Centers for Disease Control and Prevention in 2014 as the third leading cause of death in the United States (US). The main cause of COPD is exposure to tobacco smoke and air pollutants. Problems associated with COPD include under-diagnosis of the disease and an increase in the number of smokers worldwide. The goal of our study is to identify disease variability in the gene expression profiles of COPD subjects compared to controls by reanalyzing pre-existing, publicly available microarray expression datasets. Our inclusion criteria required that microarray datasets report the smoking status, age and sex of blood donors. The selected datasets used Affymetrix and Agilent microarray platforms (7 datasets, 1,262 samples). We reanalyzed the curated raw microarray expression data using R packages and used Box-Cox power transformations to normalize the datasets. To identify significant differentially expressed genes, we used generalized least squares models with disease state, age, sex, smoking status and study as effects, including binary interactions, followed by likelihood ratio tests (LRT). We found 3,315 statistically significant (Storey-adjusted q-value <0.05) differentially expressed genes with respect to disease state (COPD or control). We further filtered these genes for biological effect (using LRT q-value <0.05 and the 10% two-tailed quantiles of the model estimates of mean differences between COPD and control) to identify 679 genes. Through analysis of disease, sex, age, and also of smoking status and disease interactions, we identified differentially expressed genes involved in a variety of immune responses and cell processes in COPD. We also trained a logistic regression model using the common array genes as features, which enabled prediction of disease status with 81.7% accuracy. Our results show potential for improving blood-based diagnosis of COPD and highlight novel gene expression disease signatures.
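The two-stage gene filter described in the abstract (a q-value cutoff followed by a quantile cutoff on mean differences) can be illustrated with a minimal Python sketch. This is not the authors' code, which was written with R packages. The function name `filter_genes`, the input layout, and the toy values are hypothetical. The sketch also assumes that "10% two-tailed quantiles" means keeping genes at or below the 10th percentile or at or above the 90th percentile of the mean differences among significant genes; the abstract does not spell this out.

```python
from statistics import quantiles


def filter_genes(results, q_cutoff=0.05, tail_pct=10):
    """Keep genes that pass a q-value cutoff and lie in the tails of the
    mean-difference distribution.

    results: dict mapping gene name -> (q_value, mean_difference), where
             mean_difference is COPD minus control (hypothetical layout).
    tail_pct: size of each tail in percent (assumed interpretation of the
              "10% two-tailed quantiles" in the abstract).
    """
    # Stage 1: statistical significance (q-values are taken as given here;
    # computing Storey q-values is outside the scope of this sketch).
    significant = {g: d for g, (q, d) in results.items() if q < q_cutoff}
    if len(significant) < 2:
        # Too few genes to estimate quantiles; return what is left.
        return sorted(significant)

    # Stage 2: effect-size filter. quantiles(..., n=100) returns the
    # 99 percentile cut points; index k-1 is the k-th percentile.
    cuts = quantiles(significant.values(), n=100, method="inclusive")
    lower = cuts[tail_pct - 1]
    upper = cuts[100 - tail_pct - 1]
    return sorted(g for g, d in significant.items() if d <= lower or d >= upper)


if __name__ == "__main__":
    # Toy data: eleven significant genes with mean differences -5..5,
    # plus one gene with a large effect that fails the q-value cutoff.
    toy = {f"gene{i}": (0.01, d) for i, d in enumerate(range(-5, 6))}
    toy["geneX"] = (0.5, 10.0)
    print(filter_genes(toy))
```

Computing the quantiles on the significant genes only, rather than on all genes, is a design choice made for this sketch; either reading is compatible with the wording of the abstract.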
format Online
Article
Text
id pubmed-6857915
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6857915 2019-12-07 Gene expression microarray public dataset reanalysis in chronic obstructive pulmonary disease Rogers, Lavida R. K. Verlinde, Madison Mias, George I. PLoS One Research Article Public Library of Science 2019-11-15 /pmc/articles/PMC6857915/ /pubmed/31730674 http://dx.doi.org/10.1371/journal.pone.0224750 Text en © 2019 Rogers et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Gene expression microarray public dataset reanalysis in chronic obstructive pulmonary disease
topic Research Article