Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables

Breast cancer is a heterogeneous disease. Although gene expression profiling has led to the definition of several subtypes of breast cancer, the precise discovery of the subtypes remains a challenge. Clinical data is another promising source. In this study, clinical variables are utilized and integr...

Bibliographic Details
Main Authors: He, Zongzhen, Zhang, Junying, Yuan, Xiguo, Xi, Jianing, Liu, Zhaowen, Zhang, Yuanyuan
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6385100/
https://www.ncbi.nlm.nih.gov/pubmed/30754661
http://dx.doi.org/10.3390/molecules24030631
_version_ 1783397127642152960
author He, Zongzhen
Zhang, Junying
Yuan, Xiguo
Xi, Jianing
Liu, Zhaowen
Zhang, Yuanyuan
author_facet He, Zongzhen
Zhang, Junying
Yuan, Xiguo
Xi, Jianing
Liu, Zhaowen
Zhang, Yuanyuan
author_sort He, Zongzhen
collection PubMed
description Breast cancer is a heterogeneous disease. Although gene expression profiling has led to the definition of several subtypes of breast cancer, the precise discovery of the subtypes remains a challenge. Clinical data is another promising source. In this study, clinical variables are utilized and integrated to gene expressions for the stratification of breast cancer. We adopt two phases: gene selection and clustering, where the integration is in the gene selection phase; only genes whose expressions are most relevant to each clinical variable and least redundant among themselves are selected for further clustering. In practice, we simply utilize maximum relevance minimum redundancy (mRMR) for gene selection and k-means for clustering. We compare the results of our method with those of two commonly used only expression-based breast cancer stratification methods: prediction analysis of microarray 50 (PAM50) and highest variability (HV). The result is that our method outperforms them in identifying subtypes significantly associated with five-year survival and recurrence time. Specifically, our method identified recurrence-associated breast cancer subtypes that were not identified by PAM50 and HV. Additionally, our analysis discovered three survival-associated luminal-A subgroups and two survival-associated luminal-B subgroups. The study indicates that screening clinically relevant gene expressions yields improved breast cancer stratification.
format Online
Article
Text
id pubmed-6385100
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-63851002019-02-23 Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables He, Zongzhen Zhang, Junying Yuan, Xiguo Xi, Jianing Liu, Zhaowen Zhang, Yuanyuan Molecules Article Breast cancer is a heterogeneous disease. Although gene expression profiling has led to the definition of several subtypes of breast cancer, the precise discovery of the subtypes remains a challenge. Clinical data is another promising source. In this study, clinical variables are utilized and integrated to gene expressions for the stratification of breast cancer. We adopt two phases: gene selection and clustering, where the integration is in the gene selection phase; only genes whose expressions are most relevant to each clinical variable and least redundant among themselves are selected for further clustering. In practice, we simply utilize maximum relevance minimum redundancy (mRMR) for gene selection and k-means for clustering. We compare the results of our method with those of two commonly used only expression-based breast cancer stratification methods: prediction analysis of microarray 50 (PAM50) and highest variability (HV). The result is that our method outperforms them in identifying subtypes significantly associated with five-year survival and recurrence time. Specifically, our method identified recurrence-associated breast cancer subtypes that were not identified by PAM50 and HV. Additionally, our analysis discovered three survival-associated luminal-A subgroups and two survival-associated luminal-B subgroups. The study indicates that screening clinically relevant gene expressions yields improved breast cancer stratification. MDPI 2019-02-11 /pmc/articles/PMC6385100/ /pubmed/30754661 http://dx.doi.org/10.3390/molecules24030631 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. 
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
He, Zongzhen
Zhang, Junying
Yuan, Xiguo
Xi, Jianing
Liu, Zhaowen
Zhang, Yuanyuan
Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables
title Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables
title_full Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables
title_fullStr Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables
title_full_unstemmed Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables
title_short Stratification of Breast Cancer by Integrating Gene Expression Data and Clinical Variables
title_sort stratification of breast cancer by integrating gene expression data and clinical variables
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6385100/
https://www.ncbi.nlm.nih.gov/pubmed/30754661
http://dx.doi.org/10.3390/molecules24030631
work_keys_str_mv AT hezongzhen stratificationofbreastcancerbyintegratinggeneexpressiondataandclinicalvariables
AT zhangjunying stratificationofbreastcancerbyintegratinggeneexpressiondataandclinicalvariables
AT yuanxiguo stratificationofbreastcancerbyintegratinggeneexpressiondataandclinicalvariables
AT xijianing stratificationofbreastcancerbyintegratinggeneexpressiondataandclinicalvariables
AT liuzhaowen stratificationofbreastcancerbyintegratinggeneexpressiondataandclinicalvariables
AT zhangyuanyuan stratificationofbreastcancerbyintegratinggeneexpressiondataandclinicalvariables