Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data
Background. Novel prognostic markers are needed so that newly diagnosed breast cancer patients do not undergo unnecessary therapy. Various studies based on microarray gene expression datasets have generated gene signatures to predict prognosis outcomes, while ignoring the large amount of information...
Main authors: | Saini, Ashish; Hou, Jingyu; Zhou, Wanlei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation, 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4052785/ https://www.ncbi.nlm.nih.gov/pubmed/24949450 http://dx.doi.org/10.1155/2014/459203 |
_version_ | 1782320288442089472 |
---|---|
author | Saini, Ashish Hou, Jingyu Zhou, Wanlei |
author_facet | Saini, Ashish Hou, Jingyu Zhou, Wanlei |
author_sort | Saini, Ashish |
collection | PubMed |
description | Background. Novel prognostic markers are needed so that newly diagnosed breast cancer patients do not undergo unnecessary therapy. Various studies based on microarray gene expression datasets have generated gene signatures to predict prognosis outcomes, while ignoring the large amount of information contained in established clinical markers. Nevertheless, the small sample sizes of individual microarray datasets remain a bottleneck in generating robust gene signatures, and the resulting signatures show limited predictive power. The aim of this study is to achieve high classification accuracy first for the good prognosis group and then for the poor prognosis group. Methods. We propose a novel algorithm called the IPRE (integrated prognosis risk estimation) algorithm. We used integrated microarray datasets from multiple studies to increase the sample size (∼2,700 samples). The IPRE algorithm consists of a virtual chromosome for extracting a 79-gene prognostic signature, and a multivariate logistic regression model that incorporates clinical data along with expression data to generate a risk score formula that categorizes breast cancer patients into two prognosis groups. Results. Evaluation on two testing datasets showed that the IPRE algorithm achieved high classification accuracies of 82% and 87%, which were far greater than those of any existing algorithm. |
format | Online Article Text |
id | pubmed-4052785 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-4052785 2014-06-19 Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data Saini, Ashish Hou, Jingyu Zhou, Wanlei Biomed Res Int Research Article Hindawi Publishing Corporation 2014 2014-05-14 /pmc/articles/PMC4052785/ /pubmed/24949450 http://dx.doi.org/10.1155/2014/459203 Text en Copyright © 2014 Ashish Saini et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Saini, Ashish Hou, Jingyu Zhou, Wanlei Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data |
title | Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data |
title_full | Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data |
title_fullStr | Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data |
title_full_unstemmed | Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data |
title_short | Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data |
title_sort | breast cancer prognosis risk estimation using integrated gene expression and clinical data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4052785/ https://www.ncbi.nlm.nih.gov/pubmed/24949450 http://dx.doi.org/10.1155/2014/459203 |
work_keys_str_mv | AT sainiashish breastcancerprognosisriskestimationusingintegratedgeneexpressionandclinicaldata AT houjingyu breastcancerprognosisriskestimationusingintegratedgeneexpressionandclinicaldata AT zhouwanlei breastcancerprognosisriskestimationusingintegratedgeneexpressionandclinicaldata |
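
The description above mentions a multivariate logistic regression model that combines expression and clinical data into a risk score. As a rough illustration of that general idea only, the sketch below computes a logistic risk score from two feature vectors and assigns a prognosis group. It is not the IPRE risk score formula, which this record does not give: the function names, the weights, the intercept, the 0.5 threshold, and the patient values are all hypothetical.

```python
import math


def risk_score(expression, clinical, expr_weights, clin_weights, intercept):
    """Return a logistic risk score in (0, 1).

    The linear predictor is the intercept plus the weighted sum of the
    expression features and the weighted sum of the clinical features.
    """
    if len(expression) != len(expr_weights) or len(clinical) != len(clin_weights):
        raise ValueError("each feature vector must match its weight vector in length")
    linear = intercept
    linear += sum(w * x for w, x in zip(expr_weights, expression))
    linear += sum(w * x for w, x in zip(clin_weights, clinical))
    # Logistic (sigmoid) link maps the linear predictor to a probability-like score.
    return 1.0 / (1.0 + math.exp(-linear))


def prognosis_group(score, threshold=0.5):
    """Label a patient 'poor' at or above the threshold, otherwise 'good'."""
    return "poor" if score >= threshold else "good"


# Hypothetical coefficients for three genes and two clinical markers.
# In practice these would be fitted on training data, not chosen by hand.
EXPR_WEIGHTS = [0.8, -0.5, 0.3]
CLIN_WEIGHTS = [0.6, 0.4]
INTERCEPT = -1.0

# Hypothetical patient: normalized expression values and binary clinical flags.
patient_expression = [1.2, 0.4, 0.9]
patient_clinical = [1.0, 0.0]

score = risk_score(patient_expression, patient_clinical,
                   EXPR_WEIGHTS, CLIN_WEIGHTS, INTERCEPT)
print(prognosis_group(score), round(score, 3))
```

A real signature would involve far more expression features (79 genes, according to the description) and coefficients estimated from the integrated training datasets.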