Novel classification for global gene signature model for predicting severity of systemic sclerosis

Progression of systemic scleroderma (SSc), a chronic connective tissue disease that causes a fibrotic phenotype, is highly heterogeneous amongst patients and difficult to accurately diagnose. To meet this clinical need, we developed a novel three-layer classification model, which analyses gene expre...

Bibliographic Details
Main Authors: Johnson, Zariel I., Jones, Jacqueline D., Mukherjee, Angana, Ren, Dianxu, Feghali-Bostwick, Carol, Conley, Yvette P., Yates, Cecelia C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6010260/
https://www.ncbi.nlm.nih.gov/pubmed/29924864
http://dx.doi.org/10.1371/journal.pone.0199314
_version_ 1783333547701960704
author Johnson, Zariel I.
Jones, Jacqueline D.
Mukherjee, Angana
Ren, Dianxu
Feghali-Bostwick, Carol
Conley, Yvette P.
Yates, Cecelia C.
author_sort Johnson, Zariel I.
collection PubMed
description Progression of systemic scleroderma (SSc), a chronic connective tissue disease that causes a fibrotic phenotype, is highly heterogeneous amongst patients and difficult to accurately diagnose. To meet this clinical need, we developed a novel three-layer classification model, which analyses gene expression profiles from SSc skin biopsies to diagnose SSc severity. Two SSc skin biopsy microarray datasets were obtained from Gene Expression Omnibus. The skin scores obtained from the original papers were used to further categorize the data into subgroups of low (<18) and high (≥18) severity. Data were pre-processed by normalization, background correction, centering, and scaling. A two-layered cross-validation scheme was employed to objectively evaluate the performance of the classification models on unobserved data. Three classification models were used (support vector machine, random forest, and naive Bayes) in combination with feature selection methods to improve classification accuracy. For both input datasets, the random forest classifier combined with the correlation-based feature selection (CFS) method, and naive Bayes combined with either CFS or the support vector machine-based recursive feature elimination method, yielded the best results. Additionally, we performed a principal component analysis to show that low and high severity groups are readily separable by gene expression signatures. Ultimately, we found that our novel classification prediction model produced global gene signatures that significantly correlated with skin scores. This study represents the first report comparing the performance of various classification prediction models for gene signatures from SSc patients, using current clinical diagnostic factors. In summary, our three-classification model system is a powerful tool for elucidating gene signatures from SSc skin biopsies and can also be used to develop a prognostic gene signature for SSc and other fibrotic disorders.
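The abstract's labeling and evaluation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives the skin-score cutoff (18) and mentions centering, scaling, and a two-layered cross-validation scheme, but it does not state fold counts, software, or function names. The names `severity_label`, `k_fold_indices`, `nested_cv_splits`, `fit_scaler`, and `apply_scaler`, and the defaults of 5 outer and 3 inner folds, are hypothetical. No classifier or feature selection method is fitted here.

```python
import random
import statistics

SEVERITY_CUTOFF = 18  # skin-score threshold stated in the abstract


def severity_label(skin_score):
    """Return 'high' for scores >= 18 and 'low' otherwise."""
    return "high" if skin_score >= SEVERITY_CUTOFF else "low"


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (inner_splits, outer_train, outer_test) for each outer fold.

    inner_splits holds (train, validation) index lists drawn only from
    outer_train, so outer_test never influences model or feature selection.
    Fold counts are assumptions; the abstract does not report them.
    """
    outer_folds = k_fold_indices(n_samples, outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [
            idx for j, fold in enumerate(outer_folds) if j != i for idx in fold
        ]
        # Inner folds are positions into outer_train, mapped back to sample ids.
        inner_folds = k_fold_indices(len(outer_train), inner_k, seed + 1 + i)
        inner_splits = []
        for v, val_positions in enumerate(inner_folds):
            val = [outer_train[p] for p in val_positions]
            train = [
                outer_train[p]
                for j, fold in enumerate(inner_folds)
                if j != v
                for p in fold
            ]
            inner_splits.append((train, val))
        yield inner_splits, outer_train, outer_test


def fit_scaler(training_values):
    """Estimate mean and standard deviation from training values only."""
    mean = statistics.fmean(training_values)
    std = statistics.pstdev(training_values) or 1.0  # guard against zero spread
    return mean, std


def apply_scaler(values, mean, std):
    """Center and scale values with statistics estimated elsewhere."""
    return [(v - mean) / std for v in values]
```

In a scheme like this, the inner splits would be used to choose a feature selection method and classifier settings, and the outer test fold would be scored only once with the chosen configuration. Fitting the scaler on training indices alone keeps information from held-out samples out of the pre-processing step.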
format Online
Article
Text
id pubmed-6010260
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6010260 2018-07-06 Novel classification for global gene signature model for predicting severity of systemic sclerosis Johnson, Zariel I. Jones, Jacqueline D. Mukherjee, Angana Ren, Dianxu Feghali-Bostwick, Carol Conley, Yvette P. Yates, Cecelia C. PLoS One Research Article Public Library of Science 2018-06-20 /pmc/articles/PMC6010260/ /pubmed/29924864 http://dx.doi.org/10.1371/journal.pone.0199314 Text en © 2018 Johnson et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Novel classification for global gene signature model for predicting severity of systemic sclerosis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6010260/
https://www.ncbi.nlm.nih.gov/pubmed/29924864
http://dx.doi.org/10.1371/journal.pone.0199314